Your filesystem has doppelgängers. Let's hunt.
doppel is a blazing-fast, concurrent CLI tool written in Go for scanning directories and finding duplicate files, aka doppelgängers!
Save disk space and keep your filesystem clean by quickly identifying and managing duplicate files. Doppel is designed for speed, flexibility, and reliability.
- Table of Contents
- Quick Start
- Terminal Preview
- Features
- Installation
- Usage
- How It Works
- Development
- License
- Contributing
Install (requires Go 1.25+):
go install github.com/dr8co/doppel@latest
Scan your home directory for duplicates:
doppel find ~
Or use a preset for common scenarios:
doppel preset media ~/Pictures
- Fast scanning with parallel hashing (Blake3, configurable workers)
- Flexible filtering by file size, glob patterns, and regular expressions
- Noise reduction with path and file exclusions
- Detailed statistics and verbose output
- Dry-run mode to preview filters
- Structured output for easy integration with other tools. Supported formats:
  - JSON
  - YAML
  - Text (default)
- Extensible presets for common use cases (media, dev, docs, clean)
- Tested with unit tests and integration tests
- Cross-platform: Works on Linux, macOS, and Windows
- Automatic completion for bash, zsh, fish, and PowerShell
- Structured logging for better automation, debugging, and monitoring. Formats:
  - JSON
  - Text
  - Pretty (default)
With Go:
go install github.com/dr8co/doppel@latest
From source:
git clone https://github.com/dr8co/doppel.git
cd doppel
go build -o doppel main.go
Pre-built binaries:
See the releases page.
Doppel provides a simple command-line interface. The main command is doppel,
with subcommands for different operations.
doppel [global options] [command [command options]]
Run doppel --help to see global options and available commands.
Note
Running doppel with no command defaults to find.
Doppel supports configuration through TOML (recommended), YAML, or JSON files. Configuration files are automatically loaded from:
- $CONFIG_DIR/doppel/config.toml
- $CONFIG_DIR/doppel/config.yaml
- $CONFIG_DIR/doppel/config.json
- $CONFIG_DIR/doppel/config (assumed to be TOML if there is no extension)
where $CONFIG_DIR is your system's user configuration directory:
- Linux: ~/.config
- macOS: ~/Library/Application Support
- Windows: %AppData%
- Plan 9: ~/lib
- Other Unix: ~/.config
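To check which of these paths exist on your machine, a small Go program can help. This is a minimal sketch, not doppel's actual loader: it assumes the directory is resolved with the standard library's os.UserConfigDir (the per-platform locations listed above match that function), and the helper name configCandidates and the order in which the candidates are listed are illustrative assumptions.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// configCandidates lists the documented config file paths under configDir.
// The function name and the ordering are assumptions made for illustration.
func configCandidates(configDir string) []string {
	base := filepath.Join(configDir, "doppel")
	return []string{
		filepath.Join(base, "config.toml"),
		filepath.Join(base, "config.yaml"),
		filepath.Join(base, "config.json"),
		filepath.Join(base, "config"), // no extension: treated as TOML
	}
}

func main() {
	dir, err := os.UserConfigDir()
	if err != nil {
		fmt.Fprintln(os.Stderr, "cannot determine config directory:", err)
		os.Exit(1)
	}
	for _, p := range configCandidates(dir) {
		if _, err := os.Stat(p); err == nil {
			fmt.Println("found:  ", p)
		} else {
			fmt.Println("missing:", p)
		}
	}
}
```

Note that on Linux and other Unix systems os.UserConfigDir honors $XDG_CONFIG_HOME when it is set and only falls back to ~/.config otherwise.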
The configuration files can be used to set default values for any command-line options.
The key names in the configuration file match the long option names for each command,
with dashes replaced with underscores.
For example, to set the default minimum file size for the find command to 1.5MB,
you would add the following to your TOML configuration file:
[find]
min_size = "1.5MB"
For more details on the TOML format, see the TOML spec.
Note
Command-line arguments take precedence over configuration file values.
Doppel also supports configuration through environment variables. Environment variable names are derived from the command and option names, with the following rules:
- The prefix DOPPEL_ is added to all environment variable names.
- The command name is added after the prefix (if applicable).
- The option name is added after the command name.
- All names are converted to uppercase.
- Dashes (-) in option names are replaced with underscores (_).
For example, to set the default minimum file size for the find command to 1.5MB,
you would set the following environment variable:
DOPPEL_FIND_MIN_SIZE=1.5MB
Note
Environment variables take precedence over configuration file values, but are overridden by command-line arguments.
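The naming rules and the precedence order can be expressed in a few lines of Go. This is a sketch under stated assumptions rather than doppel's own code: envName and resolve are hypothetical helper names, and treating an empty string as "not set" is a simplification.

```go
package main

import (
	"fmt"
	"strings"
)

// envName derives an environment variable name from a command and a long
// option name, following the documented rules. Pass an empty command for
// options that do not belong to a command. (Hypothetical helper.)
func envName(command, option string) string {
	parts := []string{"DOPPEL"}
	if command != "" {
		parts = append(parts, command)
	}
	parts = append(parts, option)
	name := strings.Join(parts, "_")
	return strings.ToUpper(strings.ReplaceAll(name, "-", "_"))
}

// resolve returns the first non-empty value in precedence order:
// command-line argument, then environment variable, then config file.
// Treating "" as unset is an assumption made to keep the sketch short.
func resolve(flagValue, envValue, configValue string) string {
	for _, v := range []string{flagValue, envValue, configValue} {
		if v != "" {
			return v
		}
	}
	return ""
}

func main() {
	fmt.Println(envName("find", "min-size"))           // DOPPEL_FIND_MIN_SIZE
	fmt.Println(envName("find", "exclude-dirs-regex")) // DOPPEL_FIND_EXCLUDE_DIRS_REGEX
	fmt.Println(resolve("", "2MB", "1.5MB"))           // 2MB
}
```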
Doppel supports automatic completion for various shells. To generate completion scripts, run:
doppel completion <shell>
Where <shell> is one of: bash, zsh, fish, or pwsh.
This will print the completion script to stdout. You can redirect it to a file or source it directly in your shell.
Scan for duplicate files in the current directory:
doppel find
# or simply
doppel
Scan specific directories:
doppel find /path/to/dir1 /path/to/dir2
Options:
- -w, --workers <n>: Number of parallel hashing workers (default: number of CPUs)
- -v, --verbose: Enable verbose output
- --min-size <size>: Minimum file size to consider (default: 0 = no limit)
- --max-size <size>: Maximum file size to consider (default: 0 = no limit)
- --exclude-dirs <patterns>: Comma-separated glob patterns for directories to exclude
- --exclude-files <patterns>: Comma-separated glob patterns for files to exclude
- --exclude-dirs-regex <regexes>: Comma-separated regex patterns for directories to exclude
- --exclude-files-regex <regexes>: Comma-separated regex patterns for files to exclude
- --show-filters: Show active filters and exit
- --output-format <format>: Output format for duplicate groups (default: pretty, options: pretty, json, yaml)
- --output-file <file>: Write output to a file instead of stdout
For more details, run:
doppel find --help
# or
doppel find help
Examples:
Find duplicates in ~/Downloads and ~/Documents, excluding .git directories and files smaller than 1MB:
doppel find ~/Downloads ~/Documents --exclude-dirs=.git --min-size=1000000 --verbose
# or
doppel find ~/Downloads ~/Documents --exclude-dirs=.git --min-size=1MB --verbose
--min-size and --max-size support the following formats:
- Bytes: 100, 100B, 100b are all equivalent
- Kilobytes: 10KB, 10kB, 10Kb, 10kb, 10000 are all equivalent
- Kibibytes: 10KiB, 10kiB, 10KIB, 10240 are all equivalent
- Megabytes: 1MB, 1mB, 1Mb, 1mb, 1000000 are all equivalent
- Mebibytes: 1MiB, 1miB, 1MIB (same as 1048576)
- Gigabytes: 1GB, 1gB, 1Gb, 1gb (1000000000)
- Gibibytes: 1GiB, 1giB, 1gIB (1073741824)
- Terabytes: 1TB, 1tB, 1Tb, 1tb (1000000000000)
- Tebibytes: 1TiB, 1tiB, 1TIb (1099511627776)
- Petabytes: 1PB, 1pB, 1Pb, 1pb (1000000000000000)
- Pebibytes: 1PiB, 1piB, 1PIB (1125899906842624)
- Exabytes: 1EB, 1eB, 1Eb, 1eb (1000000000000000000)
- Exbibytes: 1EiB, 1eiB, 1EIB (1152921504606846976)
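If you need to interpret the same size strings in your own scripts or tooling, the rules above (case-insensitive units, decimal units as powers of 1000, binary units as powers of 1024) can be sketched as follows. This is not doppel's parser: parseSize and the multipliers table are hypothetical names, and fractional values such as 1.5MB go through float64, so very large fractional sizes may lose precision.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// multipliers maps lower-cased unit suffixes to their size in bytes.
var multipliers = map[string]float64{
	"": 1, "b": 1,
	"kb": 1e3, "kib": 1 << 10,
	"mb": 1e6, "mib": 1 << 20,
	"gb": 1e9, "gib": 1 << 30,
	"tb": 1e12, "tib": 1 << 40,
	"pb": 1e15, "pib": 1 << 50,
	"eb": 1e18, "eib": 1 << 60,
}

// parseSize converts strings such as "100", "10KiB", or "1.5MB" to bytes.
// (Hypothetical helper written for this example.)
func parseSize(s string) (int64, error) {
	s = strings.TrimSpace(s)
	// Find where the numeric prefix ends and the unit suffix begins.
	i := strings.IndexFunc(s, func(r rune) bool {
		return (r < '0' || r > '9') && r != '.'
	})
	num, unit := s, ""
	if i >= 0 {
		num, unit = s[:i], strings.ToLower(strings.TrimSpace(s[i:]))
	}
	mult, ok := multipliers[unit]
	if !ok {
		return 0, fmt.Errorf("unknown size unit %q", unit)
	}
	v, err := strconv.ParseFloat(num, 64)
	if err != nil {
		return 0, fmt.Errorf("invalid size %q: %w", s, err)
	}
	n := v * mult
	if n >= float64(1<<63) { // would overflow int64
		return 0, fmt.Errorf("size %q is out of range", s)
	}
	return int64(n), nil
}

func main() {
	for _, s := range []string{"100b", "10KiB", "1.5MB", "1gIB"} {
		n, err := parseSize(s)
		fmt.Println(s, n, err)
	}
}
```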
Find duplicates in /var/logs, excluding all .log files and directories starting with temp,
and ignoring empty files:
doppel find /var/logs --min-size=1 --exclude-files="*.log" --exclude-dirs="temp*" # Be sure to quote patterns!
Note
When using glob patterns and regexes, be sure to quote (and escape, if necessary) them to prevent shell expansion.
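To get a feel for how a comma-separated list of glob patterns behaves, here is a small Go sketch built on the standard library's filepath.Match. It is an illustration only: excludedByGlob is a hypothetical name, and matching against the base name of each file or directory (rather than the full path) is an assumption about how doppel applies its patterns.

```go
package main

import (
	"fmt"
	"path/filepath"
	"strings"
)

// excludedByGlob reports whether name matches any of the comma-separated
// glob patterns. (Hypothetical helper; matching on the base name only is
// an assumption.)
func excludedByGlob(name, patterns string) (bool, error) {
	for _, p := range strings.Split(patterns, ",") {
		p = strings.TrimSpace(p)
		if p == "" {
			continue
		}
		ok, err := filepath.Match(p, name)
		if err != nil {
			return false, fmt.Errorf("bad pattern %q: %w", p, err)
		}
		if ok {
			return true, nil
		}
	}
	return false, nil
}

func main() {
	for _, name := range []string{"app.log", "temp_cache", "notes.txt"} {
		skip, err := excludedByGlob(name, "*.log,temp*")
		fmt.Println(name, skip, err)
	}
}
```

Because the shell never sees these patterns in the Go program, the quoting advice above applies only when you pass them on the command line.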
Use presets for common duplicate-hunting scenarios:
- dev: Skip development directories and files (e.g., build, temp, version control)
- media: Focus on media files (images/videos), skip small files
- docs: Focus on document files
- clean: Skip temporary and cache files
Usage:
doppel preset <preset> [options]
Where <preset> is one of: dev, media, docs, or clean.
Preset options are the same as for find.
Example:
Find duplicate media files in your ~/Pictures folder:
doppel preset media ~/Pictures
- File Discovery: Recursively scans specified directories (and their subdirectories), applying filters.
- Grouping: Groups files by size to quickly eliminate non-duplicates.
- Hashing: Computes Blake3 hashes for files with matching sizes.
- Reporting: Displays groups of duplicate files and optional statistics.
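The grouping and hashing steps above can be sketched in Go as follows. This is a simplified illustration, not doppel's implementation: SHA-256 from the standard library stands in for Blake3 (which needs a third-party module), files are represented by an in-memory file type so the example runs without touching the disk, and the names file and findDuplicates are hypothetical.

```go
package main

import (
	"crypto/sha256"
	"fmt"
	"sort"
	"sync"
)

// file is an in-memory stand-in for a file on disk (hypothetical type).
type file struct {
	path string
	data []byte
}

// findDuplicates groups files by size, hashes only the sizes that have at
// least two candidates using a pool of workers, and returns sorted groups
// of paths whose contents hash identically.
func findDuplicates(files []file, workers int) [][]string {
	if workers < 1 {
		workers = 1
	}
	// Grouping: files with a unique size cannot have a duplicate.
	bySize := make(map[int][]file)
	for _, f := range files {
		bySize[len(f.data)] = append(bySize[len(f.data)], f)
	}

	// Hashing: workers pull candidates from a channel and record digests.
	jobs := make(chan file)
	byHash := make(map[[sha256.Size]byte][]string)
	var mu sync.Mutex
	var wg sync.WaitGroup
	for i := 0; i < workers; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for f := range jobs {
				sum := sha256.Sum256(f.data) // stand-in for Blake3
				mu.Lock()
				byHash[sum] = append(byHash[sum], f.path)
				mu.Unlock()
			}
		}()
	}
	for _, group := range bySize {
		if len(group) < 2 {
			continue
		}
		for _, f := range group {
			jobs <- f
		}
	}
	close(jobs)
	wg.Wait()

	// Reporting: keep only digests shared by two or more files.
	var groups [][]string
	for _, paths := range byHash {
		if len(paths) > 1 {
			sort.Strings(paths)
			groups = append(groups, paths)
		}
	}
	sort.Slice(groups, func(i, j int) bool { return groups[i][0] < groups[j][0] })
	return groups
}

func main() {
	files := []file{
		{"a.txt", []byte("hello")},
		{"b.txt", []byte("hello")},
		{"c.txt", []byte("world")}, // same size as a.txt, different content
		{"d.txt", []byte("a longer file")},
	}
	for _, g := range findDuplicates(files, 4) {
		fmt.Println(g) // [a.txt b.txt]
	}
}
```

Grouping by size first means d.txt is never hashed at all, which is where most of the time savings come from on real directory trees.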
- Code is organized in cmd/, internal/, and assets/ directories.
- Uses urfave/cli/v3 for CLI parsing.
- Uses blake3 for fast hashing.
- Run tests with:
go test -race -v ./...
This project is licensed under the MIT License. See LICENSE for details.
Contributions, issues, and feature requests are welcome! Please open an issue or pull request on GitHub.
See CONTRIBUTING.md for guidelines.
doppel: Find your duplicate files, fast and reliably.