Docker container for dupeGuru
This is a Docker container for dupeGuru.
The GUI of the application is accessed through a modern web browser (no installation or configuration needed on the client side) or via any VNC client.
dupeGuru is a tool to find duplicate files on your computer. It can scan either
filenames or contents. The filename scan features a fuzzy matching algorithm
that can find duplicate filenames even when they are not exactly the same.
Launch the dupeGuru docker container with the following command:
docker run -d --rm \
    --name=dupeguru \
    -p 5800:5800 \
    -p 5900:5900 \
    -v /docker/appdata/dupeguru:/config:rw \
    -v $HOME:/storage:rw \
    jlesage/dupeguru
/docker/appdata/dupeguru: This is where the application stores its configuration, logs and any files that need to persist.
$HOME: This location contains files from your host that need to be accessible by the application.
Browse to http://your-host-ip:5800 to access the dupeGuru GUI. Files from
the host appear under the /storage folder in the container.
docker run [-d] [--rm] \
    --name=dupeguru \
    [-e <VARIABLE_NAME>=<VALUE>]... \
    [-v <HOST_DIR>:<CONTAINER_DIR>[:PERMISSIONS]]... \
    [-p <HOST_PORT>:<CONTAINER_PORT>]... \
    jlesage/dupeguru
|-d||Run the container in background. If not set, the container runs in foreground.|
|--rm||Automatically remove the container when it exits.|
|-e||Pass an environment variable to the container. See the Environment Variables section for more details.|
|-v||Set a volume mapping (allows sharing a folder or file between the host and the container). See the Data Volumes section for more details.|
|-p||Set a network port mapping (exposes an internal container port to the host). See the Ports section for more details.|
To customize some properties of the container, the following environment
variables can be passed via the -e parameter (one for each variable). The value
of this parameter has the format <VARIABLE_NAME>=<VALUE>.
||ID of the user the application runs as. See User/Group IDs to better understand when this should be set.||
||ID of the group the application runs as. See User/Group IDs to better understand when this should be set.||
||Comma-separated list of supplementary group IDs of the application.||(unset)|
||Mask that controls how file permissions are set for newly created files. The value of the mask is in octal notation. By default, this variable is not set and the default umask of
||TimeZone of the container. Timezone can also be set by mapping
||When set to
||Priority at which the application should run. A niceness value of -20 is the highest priority and 19 is the lowest priority. By default, niceness is not set, meaning that the default niceness of 0 is used. NOTE: A negative niceness (priority increase) requires additional permissions. In this case, the container should be run with the docker option
||When set to
||Width (in pixels) of the application's window.||
||Height (in pixels) of the application's window.||
||Password needed to connect to the application's GUI. See the VNC Password section for more details.||(unset)|
||Extra options to pass to the x11vnc server running in the Docker container. WARNING: For advanced users. Do not use unless you know what you are doing.||(unset)|
The following table describes data volumes used by the container. The mappings
are set via the -v parameter. Each mapping is specified with the following
format: <HOST_DIR>:<CONTAINER_DIR>[:PERMISSIONS].
|Container path||Permissions||Description|
|/config||rw||This is where the application stores its configuration, logs and any files that need to persist.|
|/storage||rw||This location contains files from your host that need to be accessible by the application.|
|/trash||rw||This is where duplicated files are moved when they are sent to trash.|
Here is the list of ports used by the container. They can be mapped to the host
via the -p parameter (one per port mapping). Each mapping is defined in the
following format: <HOST_PORT>:<CONTAINER_PORT>. The port number inside the
container cannot be changed, but you are free to use any port on the host side.
|Port||Mapping to host||Description|
|5800||Mandatory||Port used to access the application's GUI via the web interface.|
|5900||Mandatory||Port used to access the application's GUI via the VNC protocol.|
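As a minimal sketch of mapping the container to non-default host ports, the snippet below builds the two -p flags from a single host-side HTTP port. The host port 7800 is an arbitrary example value, and the VNC host port is chosen 100 above it so that it follows the VNC_PORT = HTTP_PORT + 100 rule described under Accessing the GUI. The command is printed rather than executed so it can be reviewed first.

```shell
# Host-side ports (example values); the container-side ports stay 5800 and 5900.
http_host_port=7800
vnc_host_port=$((http_host_port + 100))

# One -p flag per port mapping, in <HOST_PORT>:<CONTAINER_PORT> form.
port_flags="-p ${http_host_port}:5800 -p ${vnc_host_port}:5900"

# Print the resulting command instead of running it.
printf '%s\n' "docker run -d --rm --name=dupeguru ${port_flags} jlesage/dupeguru"
```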
When using data volumes (-v flags), permission issues can occur between the
host and the container. For example, the user within the container may not
exist on the host. This could prevent the host from properly accessing files
and folders on the shared volume.
To avoid any problem, you can specify the user the application should run as.
This is done by passing the user ID and group ID to the container via
environment variables (GROUP_ID for the group ID).
To find the right IDs to use, issue the following command on the host, with the
user owning the data volume on the host:
id <username>
Which gives an output like this one:
uid=1000(myuser) gid=1000(myuser) groups=1000(myuser),4(adm),24(cdrom),27(sudo),46(plugdev),113(lpadmin)
The values of uid (user ID) and gid (group ID) are the ones that you should
give to the container.
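The lookup can also be scripted. The sketch below reads the numeric IDs of the current user (assumed here to be the owner of the data volume) and places them in a docker run command. GROUP_ID is the variable named in this section; USER_ID is an assumed name for its user-ID counterpart, so check the Environment Variables table before relying on it. The command is only printed, not executed.

```shell
# Numeric IDs of the current user, assumed to own the data volume on the host.
uid="$(id -u)"
gid="$(id -g)"

# GROUP_ID is named in the text; USER_ID is an assumed variable name.
cmd="docker run -d --rm --name=dupeguru \
-e USER_ID=${uid} -e GROUP_ID=${gid} \
-p 5800:5800 -p 5900:5900 \
-v /docker/appdata/dupeguru:/config:rw \
-v ${HOME}:/storage:rw \
jlesage/dupeguru"

# Print the command for review instead of running it.
printf '%s\n' "$cmd"
```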
Accessing the GUI
Assuming the container ports are mapped to the same port numbers on the host,
the graphical interface of the application can be accessed via:
A web browser:
http://<HOST IP ADDR>:5800
Any VNC client:
<HOST IP ADDR>:5900
If different ports are mapped to the host, make sure they respect the following formula:
VNC_PORT = HTTP_PORT + 100
This ensures that the GUI can be accessed with a web browser without
specifying the VNC port manually. If this is not possible, then explicitly
specify the VNC port like this:
http://<HOST IP ADDR>:5800/?port=<VNC PORT>
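The rule above can be expressed as a small helper. This is an illustrative sketch only: gui_url is a hypothetical function name, and the host address and port numbers in the usage lines are example values. It prints the plain URL when the VNC host port is exactly 100 above the HTTP host port, and appends the ?port= parameter otherwise.

```shell
# gui_url HOST HTTP_PORT VNC_PORT
# Prints the browser URL for the GUI, given the host-side port numbers.
gui_url() {
    host="$1"
    http_port="$2"
    vnc_port="$3"
    if [ "$vnc_port" -eq $((http_port + 100)) ]; then
        # Ports follow VNC_PORT = HTTP_PORT + 100: no extra parameter needed.
        printf 'http://%s:%s\n' "$host" "$http_port"
    else
        # Ports do not follow the formula: pass the VNC port explicitly.
        printf 'http://%s:%s/?port=%s\n' "$host" "$http_port" "$vnc_port"
    fi
}

gui_url 192.168.1.10 5800 5900   # prints http://192.168.1.10:5800
gui_url 192.168.1.10 8080 5901   # prints http://192.168.1.10:8080/?port=5901
```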
To restrict access to your application, a password can be specified. This can
be done via two methods:
- By using the
- By creating a
.vncpass_clear file at the root of the
This file should contain the password (in clear text). During the container
startup, the content of the file is obfuscated and renamed to
NOTE: This is a very basic way to restrict access to the application and it
should not be considered as secure in any way.
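The file-based method could be scripted along these lines. This is a sketch under assumptions: write_vnc_password is a hypothetical helper, and the directory passed to it is assumed to be the host folder mapped to the container's configuration volume (/docker/appdata/dupeguru in the launch example), since the text does not fully specify where the file must live.

```shell
# write_vnc_password CONFIG_DIR PASSWORD
# Writes the clear-text password to CONFIG_DIR/.vncpass_clear.
write_vnc_password() {
    config_dir="$1"
    password="$2"
    mkdir -p "$config_dir" || return 1
    # Subshell keeps the restrictive umask from leaking into the caller;
    # the file is created readable and writable by its owner only.
    ( umask 077 && printf '%s\n' "$password" > "${config_dir}/.vncpass_clear" )
}

# Example usage (assumed host path, placeholder password):
# write_vnc_password /docker/appdata/dupeguru 'change-me'
```

The file has to be in place before the container starts, since it is processed during startup.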
dupeGuru Deletion Options
When deleting duplicated files, dupeGuru offers two choices:
- Send files to trash
- Delete files directly
The first option moves files to the
/trash directory inside the container.
This operation can be slow for large files since it may imply a copy of the
data before the actual deletion.
There is also an option to link deleted files. It is not recommended to enable
this option, since there is a good chance that created links won't make sense
outside the container.