Just another cron alternative with a web UI, but with many more capabilities.
It runs DAGs (Directed Acyclic Graphs) defined in a simple YAML format.
Dagu is a tool for scheduling and running tasks based on a directed acyclic graph (DAG). It allows you to define dependencies between commands and represent them as a single DAG, and to schedule DAG executions with cron expressions. It natively supports running Docker containers, making HTTP requests, and executing commands over SSH.
- Single binary file installation
- Declarative YAML format for defining DAGs
- Web UI for visualizing, managing, and rerunning pipelines
- No programming required, making it easy to use and ideal for small projects
- Self-contained, with no need for a DBMS or cloud service
- Highlights
- Contents
- Getting started
- Motivation
- Why not use an existing workflow scheduler like Airflow?
- How does it work?
- Installation
- Quick start
- Command Line User Interface
- Web User Interface
- YAML Format
- Executors
- Admin Configuration
- Environment Variable
- Sending email notifications
- Base Configuration for all DAGs
- Scheduler
- Running with Docker Compose
- Building Docker Image
- REST API Interface
- Local Development Setup
- FAQ
- How to contribute?
- Where is the history data stored?
- Where are the log files stored?
- How long will the history data be stored?
- How to use a specific host and port for dagu server?
- How to specify the DAGs directory for dagu server and dagu scheduler?
- How can I retry a DAG from a specific task?
- How does it track running processes without DBMS?
- Contributions
- License
To get started with Dagu, see the installation instructions below and then check out the Quick start guide.
Legacy systems often have complex and implicit dependencies between jobs. When there are hundreds of cron jobs on a server, it can be difficult to keep track of these dependencies and to determine which job to rerun if one fails. It can also be a hassle to SSH into a server to view logs and manually rerun shell scripts one by one. Dagu aims to solve these problems by allowing you to explicitly visualize and manage pipeline dependencies as a DAG, and by providing a web UI for checking dependencies, execution status, and logs and for rerunning or stopping jobs with a simple mouse click.
There are many existing tools such as Airflow, Prefect, and Temporal, but many of these require you to write code in a programming language like Python to define your DAG. For systems that have been in operation for a long time, there may already be complex jobs with hundreds of thousands of lines of code written in languages like Perl or shell script. Adding another layer of complexity on top of this code can reduce maintainability. Dagu was designed to be easy to use, self-contained, and require no coding, making it ideal for small projects.
Dagu is a single command line tool that uses the local file system to store data, so no database management system or cloud service is required. DAGs are defined in a declarative YAML format, and existing programs can be used without modification.
You can install Dagu quickly using Homebrew or by downloading the latest binary from the Releases page on GitHub.
```sh
brew install yohamta/tap/dagu
```

Upgrade to the latest version:

```sh
brew upgrade yohamta/tap/dagu
```

Via Bash script:

```sh
curl -L https://raw.githubusercontent.com/yohamta/dagu/main/scripts/downloader.sh | bash
```

Via Docker:

```sh
docker run \
  --rm \
  -p 8080:8080 \
  -v $HOME/.dagu/dags:/home/dagu/.dagu/dags \
  -v $HOME/.dagu/data:/home/dagu/.dagu/data \
  -v $HOME/.dagu/logs:/home/dagu/.dagu/logs \
  yohamta/dagu:latest
```

Or download the latest binary from the Releases page and place it in your `$PATH` (e.g. `/usr/local/bin`).
Start the server with dagu server and browse to http://127.0.0.1:8080 to explore the Web UI.
Create a DAG by clicking the New DAG button on the top page of the web UI. Enter `example` as the DAG name in the dialog.
Note: DAG (YAML) files will be placed in ~/.dagu/dags by default. See Admin Configuration for more details.
Go to the SPEC Tab and hit the Edit button. Copy & Paste this example YAML and click the Save button.
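The example YAML referenced above is not reproduced in this document; a minimal sketch you could paste instead (the step name and command are illustrative) is:

```yaml
steps:
  - name: hello
    command: echo hello from dagu
```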
You can execute the example by pressing the Start button.
Note: Leave the parameter field in the dialog blank and press OK.
- `dagu start [--params=<params>] <file>` - Runs the DAG
- `dagu status <file>` - Displays the current status of the DAG
- `dagu retry --req=<request-id> <file>` - Re-runs the specified DAG run
- `dagu stop <file>` - Stops the DAG execution by sending TERM signals
- `dagu restart <file>` - Restarts the currently running DAG
- `dagu dry [--params=<params>] <file>` - Dry-runs the DAG
- `dagu server [--host=<host>] [--port=<port>] [--dags=<path/to/the DAGs directory>]` - Launches the Dagu web UI server
- `dagu scheduler [--dags=<path/to/the DAGs directory>]` - Starts the scheduler process
- `dagu version` - Shows the current binary version
The `--config=<config>` option is available to all commands. It allows you to specify a different Dagu configuration for each command, which enables you to manage multiple Dagu processes on a single instance. See Admin Configuration for more details.
For example:
```sh
dagu server --config=~/.dagu/dev.yaml
dagu scheduler --config=~/.dagu/dev.yaml
```

- DAGs: Shows all DAGs and their real-time status.
- DAG Details: Shows the real-time status, logs, and DAG configurations. You can edit DAG configurations in the browser, and switch to the vertical graph with the button in the top-right corner.
- Search DAGs: Greps the given text across all DAGs.
- Execution History: Shows past execution results and logs.
- DAG Execution Log: Shows the detailed log and standard output of each execution and step.
To view all examples, visit this page.
The minimal DAG definition is as simple as follows.
```yaml
steps:
  - name: step 1
    command: echo hello
  - name: step 2
    command: echo world
    depends:
      - step 1
```

The `script` field provides a way to run arbitrary snippets of code in any language.
```yaml
steps:
  - name: step 1
    command: "bash"
    script: |
      cd /tmp
      echo "hello world" > hello
      cat hello
    output: RESULT
  - name: step 2
    command: echo ${RESULT} # hello world
    depends:
      - step 1
```

You can define environment variables and refer to them using the `env` field.
```yaml
env:
  - SOME_DIR: ${HOME}/batch
  - SOME_FILE: ${SOME_DIR}/some_file
steps:
  - name: some task in some dir
    dir: ${SOME_DIR}
    command: python main.py ${SOME_FILE}
```

You can define parameters using the `params` field and refer to each parameter as `$1`, `$2`, etc. Parameters can also be command substitutions or environment variables. They can be overridden by the `--params=` option of the `start` command.
```yaml
params: param1 param2
steps:
  - name: some task with parameters
    command: python main.py $1 $2
```

Named parameters are also available as follows.
```yaml
params: ONE=1 TWO=`echo 2`
steps:
  - name: some task with parameters
    command: python main.py $ONE $TWO
```

You can use command substitution in field values: a string enclosed in backquotes (`` ` ``) is evaluated as a command and replaced with its standard output.
```yaml
env:
  TODAY: "`date '+%Y%m%d'`"
steps:
  - name: hello
    command: "echo hello, today is ${TODAY}"
```

Sometimes you have parts of a DAG that you only want to run under certain conditions. You can use the `preconditions` field to add conditional branches to your DAG.
For example, the task below only runs on the first date of each month.
```yaml
steps:
  - name: A monthly task
    command: monthly.sh
    preconditions:
      - condition: "`date '+%d'`"
        expected: "01"
```

If you want the DAG to continue to the next step regardless of the step's conditional check result, you can use the `continueOn` field:
```yaml
steps:
  - name: A monthly task
    command: monthly.sh
    preconditions:
      - condition: "`date '+%d'`"
        expected: "01"
    continueOn:
      skipped: true
```

The `output` field can be used to set an environment variable from standard output. Leading and trailing whitespace is trimmed automatically. The environment variable can then be used in subsequent steps.
```yaml
steps:
  - name: step 1
    command: "echo foo"
    output: FOO # will contain "foo"
```

The `stdout` field can be used to write standard output to a file.
```yaml
steps:
  - name: create a file
    command: "echo hello"
    stdout: "/tmp/hello" # the content will be "hello\n"
```

The `stderr` field allows you to redirect stderr to another file without writing it to the normal log file.
```yaml
steps:
  - name: output error file
    command: "echo error message >&2"
    stderr: "/tmp/error.txt"
```

It is often desirable to take action when a specific event happens, for example, when a DAG fails. To achieve this, you can use the `handlerOn` fields.
```yaml
handlerOn:
  failure:
    command: notify_error.sh
  exit:
    command: cleanup.sh
steps:
  - name: A task
    command: main.sh
```

If you want a task to repeat at regular intervals, you can use the `repeatPolicy` field. To stop a repeating task, use the `stop` command to shut it down gracefully.
```yaml
steps:
  - name: A task
    command: main.sh
    repeatPolicy:
      repeat: true
      intervalSec: 60
```

Combining these settings gives you granular control over how the DAG runs.
```yaml
name: all configuration      # Name (optional; defaults to the filename)
description: run a DAG       # Description
schedule: "0 * * * *"        # Execution schedule (cron expression)
group: DailyJobs             # Group name to organize DAGs (optional)
tags: example                # Free tags (comma-separated)
env:                         # Environment variables
  - LOG_DIR: ${HOME}/logs
  - PATH: /usr/local/bin:${PATH}
logDir: ${LOG_DIR}           # Log directory for standard output; default: ${DAGU_HOME}/logs/dags
restartWaitSec: 60           # Wait 60s after the process is stopped, then restart the DAG
histRetentionDays: 3         # Execution history retention days (not for log files)
delaySec: 1                  # Interval in seconds between steps
maxActiveRuns: 1             # Maximum number of steps that can run in parallel
params: param1 param2        # Default parameters, referred to as $1, $2, ...
preconditions:               # Preconditions for whether the DAG is allowed to run
  - condition: "`echo $2`"   # Command or variable to evaluate
    expected: "param2"       # Expected value for the condition
mailOn:
  failure: true              # Send a mail when the DAG failed
  success: true              # Send a mail when the DAG finished
MaxCleanUpTimeSec: 300       # Maximum time to wait after sending a TERM signal to running steps before killing them
handlerOn:                   # Handlers on Success, Failure, Cancel, and Exit
  success:
    command: "echo succeed"  # Command to execute when the execution succeeds
  failure:
    command: "echo failed"   # Command to execute when the execution fails
  cancel:
    command: "echo canceled" # Command to execute when the execution is canceled
  exit:
    command: "echo finished" # Command to execute when the execution finishes
steps:
  - name: some task          # Step name
    description: some task   # Step description
    dir: ${HOME}/logs        # Working directory (default: the directory of the DAG file)
    command: bash            # Command and parameters
    stdout: /tmp/outfile
    output: RESULT_VARIABLE
    script: |
      echo "any script"
    signalOnStop: "SIGINT"   # Signal name (e.g. SIGINT) to be sent when the process is stopped
    mailOn:
      failure: true          # Send a mail when the step failed
      success: true          # Send a mail when the step finished
    continueOn:
      failure: true          # Continue to the next step regardless of whether this step failed
      skipped: true          # Continue to the next step regardless of whether the preconditions were met
    retryPolicy:             # Retry policy for the step
      limit: 2               # Retry up to 2 times when the step failed
      intervalSec: 5         # Interval before retrying
    repeatPolicy:            # Repeat policy for the step
      repeat: true           # Whether to repeat this step
      intervalSec: 60        # Interval in seconds between repeats
    preconditions:           # Preconditions for whether the step is allowed to run
      - condition: "`echo $1`" # Command or variable to evaluate
        expected: "param1"   # Expected value for the condition
```

The global configuration file `~/.dagu/config.yaml` is useful for gathering common settings such as `logDir` or `env`.
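For instance, a minimal sketch of such a shared configuration (the log path is hypothetical) might contain just the common fields:

```yaml
logDir: ${HOME}/dagu-logs       # hypothetical shared log directory
env:
  - PATH: /usr/local/bin:${PATH}
```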
The executor field provides different execution methods for each step.
The `docker` executor allows you to run Docker containers instead of bare commands.

Note: this requires the Docker daemon to be running on the host.

In the example below, it pulls Deno's Docker image and prints 'Hello World'.
```yaml
steps:
  - name: deno_hello_world
    executor:
      type: docker
      config:
        image: "denoland/deno:1.10.3"
        autoRemove: true
    command: run https://examples.deno.land/hello-world.ts
```

To see more configurations, visit this page.
The `http` executor allows you to make an arbitrary HTTP request.

```yaml
steps:
  - name: send POST request
    executor: http
    command: POST https://foo.bar.com
    script: |
      {
        "timeout": 10,
        "headers": {
          "Authorization": "Bearer $TOKEN"
        },
        "query": {
          "key": "value"
        },
        "body": "post body"
      }
```

The `ssh` executor allows you to execute commands on remote hosts over SSH.
```yaml
steps:
  - name: step1
    executor:
      type: ssh
      config:
        user: dagu
        ip: XXX.XXX.XXX.XXX
        port: 22
        key: /Users/dagu/.ssh/private.pem
    command: /usr/sbin/ifconfig
```

To configure Dagu, create the admin config file (default path: `~/.dagu/admin.yaml`). All fields are optional.
```yaml
# Web server host and port
host: <hostname for web UI address>             # default: 127.0.0.1
port: <port number for web UI address>          # default: 8080

# Path to the DAGs directory
dags: <the location of DAG configuration files> # default: ${DAGU_HOME}/dags

# Web UI color & title
navbarColor: <admin-web header color>           # header color for web UI (e.g. "#ff0000")
navbarTitle: <admin-web title text>             # header title for web UI (e.g. "PROD")

# Basic auth
isBasicAuth: <true|false>                       # enables basic auth
basicAuthUsername: <username for basic auth of web UI>
basicAuthPassword: <password for basic auth of web UI>

# Base config
baseConfig: <base DAG config path>              # default: ${DAGU_HOME}/config.yaml

# Others
logDir: <internal log directory>                # default: ${DAGU_HOME}/logs/admin
command: <absolute path to the dagu binary>     # default: dagu
```

You can configure Dagu's internal working directory by setting the `DAGU_HOME` environment variable. The default path is `~/.dagu/`.
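For example, a minimal sketch (the `/opt/dagu` path is hypothetical):

```shell
# Point Dagu's working directory somewhere custom (the /opt/dagu path is hypothetical)
export DAGU_HOME=/opt/dagu
# DAGs would then be read from $DAGU_HOME/dags and history written under $DAGU_HOME/data
echo "$DAGU_HOME/dags"
```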
Email notifications can be sent when a DAG finishes with an error or successfully. To do so, set the `smtp` field and related fields in the DAG spec. You can use any email delivery service (e.g. SendGrid, Mailgun, etc.).
```yaml
# Email notification settings
mailOn:
  failure: true
  success: true

# SMTP server settings
smtp:
  host: "smtp.foo.bar"
  port: "587"
  username: "<username>"
  password: "<password>"

# Error mail configuration
errorMail:
  from: "[email protected]"
  to: "[email protected]"
  prefix: "[Error]"

# Info mail configuration
infoMail:
  from: "[email protected]"
  to: "[email protected]"
  prefix: "[Info]"
```

If you want to use the same settings for all DAGs, put them in the base configuration.
Creating a base configuration (default path: ~/.dagu/config.yaml) is a convenient way to organize shared settings among all DAGs. The path to the base configuration file can be configured. See Admin Configuration for more details.
```yaml
# Directory path to save logs from standard output
logDir: /path/to/stdout-logs/

# History retention days (default: 30)
histRetentionDays: 3

# Email notification settings
mailOn:
  failure: true
  success: true

# SMTP server settings
smtp:
  host: "smtp.foo.bar"
  port: "587"
  username: "<username>"
  password: "<password>"

# Error mail configuration
errorMail:
  from: "[email protected]"
  to: "[email protected]"
  prefix: "[Error]"

# Info mail configuration
infoMail:
  from: "[email protected]"
  to: "[email protected]"
  prefix: "[Info]"
```

To run DAGs automatically, you need to run the `dagu scheduler` process on your system.
You can specify the schedule with a cron expression in the `schedule` field in the config file as follows.
```yaml
schedule: "5 4 * * *" # Run at 04:05
steps:
  - name: scheduled job
    command: job.sh
```

Or you can set multiple schedules.
```yaml
schedule:
  - "30 7 * * *" # Run at 7:30
  - "0 20 * * *" # Also run at 20:00
steps:
  - name: scheduled job
    command: job.sh
```

If you want to start and stop a long-running process on a fixed schedule, you can define start and stop times as follows. At the stop time, each step's process receives a stop signal.
```yaml
schedule:
  start: "0 8 * * *" # starts at 8:00
  stop: "0 13 * * *" # stops at 13:00
steps:
  - name: scheduled job
    command: job.sh
```

You can also set multiple start/stop schedules. In the following example, the process runs from 0:00-5:00 and 12:00-17:00.
```yaml
schedule:
  start:
    - "0 0 * * *"  # starts at 0:00
    - "0 12 * * *" # starts at 12:00
  stop:
    - "0 5 * * *"  # stops at 5:00
    - "0 17 * * *" # stops at 17:00
steps:
  - name: some long-process
    command: main.sh
```

If you want to restart a DAG process on a fixed schedule, the `restart` field is also available. At the restart time, the DAG execution is stopped and restarted.
```yaml
schedule:
  start: "0 8 * * *"    # starts at 8:00
  restart: "0 12 * * *" # restarts at 12:00
  stop: "0 13 * * *"    # stops at 13:00
steps:
  - name: scheduled job
    command: job.sh
```

The wait time between stopping the job and restarting it can be configured in the DAG definition as follows. The default value is 0 (zero).
```yaml
restartWaitSec: 60 # Wait 60s after the process is stopped, then restart the DAG.
steps:
  - name: step1
    command: python some_app.py
```

The easiest way to make sure the scheduler is always running on your system is to create the script below and execute it every minute using cron (this way you don't need a root account).
```bash
#!/bin/bash
process="dagu scheduler"
command="/usr/bin/dagu scheduler"
if ps ax | grep -v grep | grep "$process" > /dev/null
then
    exit
else
    $command &
fi
exit
```

Set the `dags` field to specify the directory of the DAGs.

```yaml
dags: <the location of DAG configuration files> # default: ~/.dagu/dags
```

To automate workflows based on cron expressions, you need to run both the admin server and the scheduler process. Here is an example `docker-compose.yml` setup for running Dagu with Docker Compose.
```yaml
version: "3.9"
services:
  # init container updates permissions
  init:
    image: "yohamta/dagu:latest"
    user: root
    volumes:
      - data:/home/dagu/.dagu/data
      - logs:/home/dagu/.dagu/logs
    command: chown -R dagu /home/dagu/.dagu/
  # admin web server process
  server:
    image: "yohamta/dagu:latest"
    restart: unless-stopped
    ports:
      - "8080:8080"
    volumes:
      - data:/home/dagu/.dagu/data
      - logs:/home/dagu/.dagu/logs
      - <path to config files>:/home/dagu/.dagu
      - <path to dag files>:/home/dagu/.dagu/dags
    depends_on:
      - init
  # scheduler process
  scheduler:
    image: "yohamta/dagu:latest"
    restart: unless-stopped
    volumes:
      - data:/home/dagu/.dagu/data
      - logs:/home/dagu/.dagu/logs
      - <path to config files>:/home/dagu/.dagu
      - <path to dag files>:/home/dagu/.dagu/dags
    command: dagu scheduler
    depends_on:
      - init
volumes:
  data: {}
  logs: {}
```

Download the Dockerfile to your local machine to build an image yourself.
For example:
```sh
DAGU_VERSION=1.9.0
docker build -t dagu:${DAGU_VERSION} \
  --build-arg VERSION=${DAGU_VERSION} \
  --no-cache .
```

Please refer to the REST API Docs.
- Install yarn: `npm i -g yarn`
- Build the frontend project: `make build-admin`
- Build the `dagu` binary to `bin/dagu`: `make build`

Feel free to contribute in any way you want. Share ideas and questions, submit issues, and create pull requests. Thanks!
Execution history data is stored in the path specified by the `DAGU__DATA` environment variable. The default location is `$HOME/.dagu/data`.
Log files are stored in the path specified by the `DAGU__LOGS` environment variable. The default location is `$HOME/.dagu/logs`. You can override this with the `logDir` field in a DAG's YAML file.
The default retention period for execution history is 30 days. You can override this with the `histRetentionDays` field in a YAML file.
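Both overrides are plain DAG-level fields; a minimal sketch (the values are illustrative, not defaults):

```yaml
logDir: /var/log/dagu  # hypothetical log directory override
histRetentionDays: 7   # keep execution history for 7 days instead of the default 30
```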
dagu server's host and port can be configured in the admin configuration file as below. See Admin Configuration for more details.
```yaml
host: <hostname for web UI address>    # default: 127.0.0.1
port: <port number for web UI address> # default: 8080
```

You can customize the DAGs directory used by `dagu server` and `dagu scheduler`. See Admin Configuration for more details.
```yaml
dags: <the location of DAG configuration files> # default: ${DAGU_HOME}/dags
```

You can change the status of any task to a failed state. Then, when you retry the DAG, it will execute the failed task and any subsequent tasks.
dagu uses Unix sockets to communicate with running processes.
We welcome contributions to Dagu! If you have an idea for a new feature or have found a bug, please open an issue on the GitHub repository. If you would like to contribute code, please follow these steps:
- Fork the repository
- Create a new branch for your changes
- Make your changes and commit them to your branch
- Push your branch to your fork and open a pull request
This project is licensed under the GNU GPLv3 - see the LICENSE.md file for details