dr8co/doppel

Your filesystem has doppelgängers. Let’s hunt.


doppel is a blazing-fast, concurrent CLI tool written in Go for scanning directories and finding duplicate files, aka doppelgängers! 🕵️‍♂️🗂️

Save disk space and keep your filesystem clean by quickly identifying and managing duplicate files. Doppel is designed for speed, flexibility, and reliability.


⚡️ Quick Start

Install (requires Go 1.25+):

go install github.com/dr8co/doppel@latest

Scan your home directory for duplicates:

doppel find ~

Or use a preset for common scenarios:

doppel preset media ~/Pictures

🔮 Terminal Preview

terminal preview

✨ Features

  • ⚡️ Fast scanning with parallel hashing (Blake3, configurable workers)
  • 🔍 Flexible filtering by file size, glob patterns, and regular expressions
  • 🔇 Noise reduction with path and file exclusions
  • 📊 Detailed statistics and verbose output
  • 🛠️ Dry-run mode to preview filters
  • 📄 Structured output for easy integration with other tools. Supported formats:
    • JSON
    • YAML
    • Text (default)
  • 🧩 Extensible presets for common use cases (media, dev, docs, clean)
  • 🧪 Tested with unit tests and integration tests
  • 💻 Cross-platform: Works on Linux, macOS, and Windows
  • 🛠️ Automatic completion for bash, zsh, fish, and PowerShell
  • 📜 Structured logging for better automation, debugging, and monitoring. Formats:
    • JSON
    • Text
    • Pretty (default)

📦 Installation

With Go:

go install github.com/dr8co/doppel@latest

From source:

git clone https://github.com/dr8co/doppel.git
cd doppel
go build -o doppel main.go

Pre-built binaries:

See the 🚀 releases page.

🚀 Usage

🛠️ Command-Line Interface

Doppel provides a simple command-line interface. The main command is doppel, with subcommands for different operations.

doppel [global options] [command [command options]]

Run doppel --help to see global options and available commands.

Note

Running doppel with no command defaults to find.

⚙️ Configuration Files

Doppel supports configuration through TOML (recommended), YAML, or JSON files. Configuration files are automatically loaded from:

  • $CONFIG_DIR/doppel/config.toml
  • $CONFIG_DIR/doppel/config.yaml
  • $CONFIG_DIR/doppel/config.json
  • $CONFIG_DIR/doppel/config (assumed to be TOML when the file has no extension)

where $CONFIG_DIR is your system's user configuration directory:

  • Linux: ~/.config
  • macOS: ~/Library/Application Support
  • Windows: %AppData%
  • Plan 9: ~/lib
  • Other Unix: ~/.config
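The directories listed above coincide with what Go's standard os.UserConfigDir returns on each platform (on Linux it also honors $XDG_CONFIG_HOME when set). The sketch below prints the candidate file paths for the current user. That doppel itself calls os.UserConfigDir is an assumption, the helper name candidateConfigPaths is hypothetical, and the order simply mirrors the list above rather than a documented lookup order.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// candidateConfigPaths returns the config file locations named in this
// section, relative to the given user configuration directory.
// Hypothetical helper: not part of doppel's code.
func candidateConfigPaths(configDir string) []string {
	base := filepath.Join(configDir, "doppel")
	names := []string{"config.toml", "config.yaml", "config.json", "config"}
	paths := make([]string, 0, len(names))
	for _, name := range names {
		paths = append(paths, filepath.Join(base, name))
	}
	return paths
}

func main() {
	// os.UserConfigDir resolves the per-user configuration directory.
	dir, err := os.UserConfigDir()
	if err != nil {
		fmt.Fprintln(os.Stderr, "cannot determine config directory:", err)
		return
	}
	for _, p := range candidateConfigPaths(dir) {
		fmt.Println(p)
	}
}
```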

The configuration files can be used to set default values for any command-line options.

The key names in the configuration file match the long option names for each command, with dashes replaced by underscores. For example, to set the default minimum file size for the find command to 1.5MB, you would add the following to your TOML configuration file:

[find]
min_size = "1.5MB"

For more details on the TOML format, see the TOML spec.

Note

Command-line arguments take precedence over configuration file values.

Environment Variables

Doppel also supports configuration through environment variables. Environment variable names are derived from the command and option names, with the following rules:

  • The prefix DOPPEL_ is added to all environment variable names.
  • The command name is added after the prefix (if applicable).
  • The option name is added after the command name.
  • All names are converted to uppercase.
  • Dashes (-) in option names are replaced with underscores (_).

For example, to set the default minimum file size for the find command to 1.5MB, you would set the following environment variable:

DOPPEL_FIND_MIN_SIZE=1.5MB
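The naming rules above are mechanical, so they can be sketched as a small function. This is only an illustration of the rules as stated; the function name envVarName is hypothetical and is not taken from doppel's source.

```go
package main

import (
	"fmt"
	"strings"
)

// envVarName derives an environment variable name from a command name and a
// long option name. Pass an empty command for global options.
// Hypothetical helper: it only encodes the rules listed in this section.
func envVarName(command, option string) string {
	parts := []string{"DOPPEL"}
	if command != "" {
		parts = append(parts, command)
	}
	// Accept the option with or without its leading dashes.
	parts = append(parts, strings.TrimLeft(option, "-"))
	name := strings.ToUpper(strings.Join(parts, "_"))
	return strings.ReplaceAll(name, "-", "_")
}

func main() {
	fmt.Println(envVarName("find", "min-size"))       // DOPPEL_FIND_MIN_SIZE
	fmt.Println(envVarName("find", "--exclude-dirs")) // DOPPEL_FIND_EXCLUDE_DIRS
}
```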

Note

Environment variables take precedence over configuration file values, but are overridden by command-line arguments.
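Taken together, the two precedence notes give the order: command-line argument, then environment variable, then configuration file, then the built-in default. A minimal sketch of that resolution order follows. The function resolve is hypothetical, and treating an empty string as "not set" is a simplification made for this example, not a statement about how doppel tracks unset options.

```go
package main

import "fmt"

// resolve returns the first value that is set, checking sources from highest
// to lowest precedence: flag, environment variable, config file, default.
// An empty string stands for "not set" (a simplification for illustration).
func resolve(flag, env, config, def string) string {
	for _, v := range []string{flag, env, config} {
		if v != "" {
			return v
		}
	}
	return def
}

func main() {
	fmt.Println(resolve("", "1.5MB", "10MB", "0"))    // env beats config: 1.5MB
	fmt.Println(resolve("2MB", "1.5MB", "10MB", "0")) // flag beats everything: 2MB
	fmt.Println(resolve("", "", "", "0"))             // nothing set: 0
}
```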

Automatic Completion

Doppel supports automatic completion for various shells. To generate completion scripts, run:

doppel completion <shell>

Where <shell> is one of: bash, zsh, fish, or pwsh.

This will print the completion script to stdout. You can redirect it to a file or source it directly in your shell.

🔎 Find Command

Scan for duplicate files in the current directory:

doppel find
# or simply
doppel

Scan specific directories:

doppel find /path/to/dir1 /path/to/dir2

⚙️ Find Command Options

  • -w, --workers <n>: Number of parallel hashing workers (default: number of CPUs)
  • -v, --verbose: Enable verbose output
  • --min-size <size>: Minimum file size to consider (default: 0 = no limit)
  • --max-size <size>: Maximum file size to consider (default: 0 = no limit)
  • --exclude-dirs <patterns>: Comma-separated glob patterns for directories to exclude
  • --exclude-files <patterns>: Comma-separated glob patterns for files to exclude
  • --exclude-dirs-regex <regexes>: Comma-separated regex patterns for directories to exclude
  • --exclude-files-regex <regexes>: Comma-separated regex patterns for files to exclude
  • --show-filters: Show active filters and exit
  • --output-format <format>: Output format for duplicate groups (default: pretty, options: pretty, json, yaml)
  • --output-file <file>: Write output to a file instead of stdout

For more details, run:

doppel find --help
# or
doppel find help

Examples:

Find duplicates in ~/Downloads and ~/Documents, excluding .git directories and files smaller than 1MB:

doppel find ~/Downloads ~/Documents --exclude-dirs=.git --min-size=1000000 --verbose
# or
doppel find ~/Downloads ~/Documents --exclude-dirs=.git --min-size=1MB --verbose

--min-size and --max-size support the following formats:

  • Bytes: 100, 100B, 100b are all equivalent
  • Kilobytes: 10KB, 10kB, 10Kb, 10kb, 10000 are all equivalent
  • Kibibytes: 10KiB, 10kiB, 10KIB, 10240 are all equivalent
  • Megabytes: 1MB, 1mB, 1Mb, 1mb, 1000000 are all equivalent
  • Mebibytes: 1MiB, 1miB, 1MIB, 1048576 are all equivalent
  • Gigabytes: 1GB, 1gB, 1Gb, 1gb, 1000000000 are all equivalent
  • Gibibytes: 1GiB, 1giB, 1gIB, 1073741824 are all equivalent
  • Terabytes: 1TB, 1tB, 1Tb, 1tb, 1000000000000 are all equivalent
  • Tebibytes: 1TiB, 1tiB, 1TIb, 1099511627776 are all equivalent
  • Petabytes: 1PB, 1pB, 1Pb, 1pb, 1000000000000000 are all equivalent
  • Pebibytes: 1PiB, 1piB, 1PIB, 1125899906842624 are all equivalent
  • Exabytes: 1EB, 1eB, 1Eb, 1eb, 1000000000000000000 are all equivalent
  • Exbibytes: 1EiB, 1eiB, 1EIB, 1152921504606846976 are all equivalent
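The list above amounts to a case-insensitive unit suffix with decimal (powers of 1000) and binary (powers of 1024) multipliers. The sketch below shows one way such a parser could look. It is not doppel's actual parser: parseSize and unitMultipliers are hypothetical names, and going through float64 (so that fractional values such as 1.5MB work) loses precision for plain byte counts above 2^53.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// unitMultipliers maps lowercase unit suffixes to their size in bytes.
var unitMultipliers = map[string]float64{
	"": 1, "b": 1,
	"kb": 1e3, "kib": 1 << 10,
	"mb": 1e6, "mib": 1 << 20,
	"gb": 1e9, "gib": 1 << 30,
	"tb": 1e12, "tib": 1 << 40,
	"pb": 1e15, "pib": 1 << 50,
	"eb": 1e18, "eib": 1 << 60,
}

// parseSize converts strings such as "100", "10kB", or "1.5MiB" to bytes.
func parseSize(s string) (int64, error) {
	s = strings.TrimSpace(s)
	// Find where the numeric prefix ends and the unit suffix begins.
	i := strings.IndexFunc(s, func(r rune) bool {
		return (r < '0' || r > '9') && r != '.'
	})
	if i < 0 {
		i = len(s)
	}
	num, err := strconv.ParseFloat(s[:i], 64)
	if err != nil {
		return 0, fmt.Errorf("invalid size %q: %w", s, err)
	}
	unit := strings.ToLower(strings.TrimSpace(s[i:]))
	mult, ok := unitMultipliers[unit]
	if !ok {
		return 0, fmt.Errorf("invalid size %q: unknown unit %q", s, unit)
	}
	total := num * mult
	if total >= 1<<63 {
		return 0, fmt.Errorf("size %q overflows int64", s)
	}
	return int64(total), nil
}

func main() {
	for _, s := range []string{"100", "10KB", "10KiB", "1.5MB", "1GiB", "10XB"} {
		n, err := parseSize(s)
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Printf("%-6s = %d bytes\n", s, n)
	}
}
```

Under these assumptions, 1.5MB resolves to 1500000 bytes and 10KiB to 10240 bytes, matching the equivalences in the list.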

Find duplicates in /var/logs, excluding all .log files and directories starting with temp, and ignoring empty files:

doppel find /var/logs --min-size=1 --exclude-files="*.log" --exclude-dirs="temp*" # Be sure to quote patterns!

Note

When using glob patterns and regexes, be sure to quote (and escape, if necessary) them to prevent shell expansion.
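To illustrate what a comma-separated glob exclusion does once the shell has passed it through unexpanded, here is a sketch built on Go's standard filepath.Match. Two things are assumptions: that patterns are compared against the base name of each file or directory (rather than the full path), and that malformed patterns are silently skipped. The helper names splitPatterns and excluded are hypothetical.

```go
package main

import (
	"fmt"
	"path/filepath"
	"strings"
)

// splitPatterns turns a comma-separated flag value into a list of patterns,
// dropping surrounding whitespace and empty entries.
func splitPatterns(value string) []string {
	var patterns []string
	for _, p := range strings.Split(value, ",") {
		if p = strings.TrimSpace(p); p != "" {
			patterns = append(patterns, p)
		}
	}
	return patterns
}

// excluded reports whether the base name of path matches any glob pattern.
// Patterns that filepath.Match rejects as malformed are ignored.
func excluded(path string, patterns []string) bool {
	name := filepath.Base(path)
	for _, p := range patterns {
		if ok, err := filepath.Match(p, name); err == nil && ok {
			return true
		}
	}
	return false
}

func main() {
	files := splitPatterns("*.log, *.tmp")
	dirs := splitPatterns("temp*")

	fmt.Println(excluded("/var/logs/app.log", files)) // true
	fmt.Println(excluded("/var/logs/app.txt", files)) // false
	fmt.Println(excluded("/var/logs/temp-01", dirs))  // true
}
```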

🎛️ Preset Command

Use presets for common duplicate-hunting scenarios:

  • dev: Skip development directories and files (e.g., build, temp, version control)
  • media: Focus on media files (images/videos), skip small files
  • docs: Focus on document files
  • clean: Skip temporary and cache files

Usage:

doppel preset <preset> [options]

Where <preset> is one of: dev, media, docs, or clean.

Preset options are the same as for find.

Example:

Find duplicate media files in your ~/Pictures folder:

doppel preset media ~/Pictures

🧬 How It Works

  1. File Discovery: Recursively scans specified directories (and their subdirectories), applying filters.
  2. Grouping: Groups files by size to quickly eliminate non-duplicates.
  3. Hashing: Computes Blake3 hashes for files with matching sizes.
  4. Reporting: Displays groups of duplicate files and optional statistics.

🏗️ Development

  • 📁 Code is organized in cmd/, internal/, and assets/ directories.

  • 🧩 Uses urfave/cli/v3 for CLI parsing.

  • 🔑 Uses blake3 for fast hashing.

  • 🧪 Run tests with:

    go test -race -v ./...

📜 License

This project is licensed under the MIT License. See LICENSE for details.

🤝 Contributing

Contributions, issues, and feature requests are welcome! Please open an issue or pull request on GitHub.

See CONTRIBUTING.md for guidelines.


doppel: Find your duplicate files, fast and reliably. ✨
