From 94185d8b39af5a96a61ffb4df7a2d76dcd7afa49 Mon Sep 17 00:00:00 2001
From: Lia Lenckowski
Date: Sat, 17 Feb 2024 21:37:01 +0100
Subject: update README with new approx algorithm and metrics

---
 README.md | 10 +++++++---
 1 file changed, 7 insertions(+), 3 deletions(-)

diff --git a/README.md b/README.md
index fd38eab..29997ae 100644
--- a/README.md
+++ b/README.md
@@ -8,19 +8,23 @@ The sorting can be accessed by letting the progam print the image paths in order
 
 Detailed usage:
 ```
-embeddings-sort [OPTIONS] [IMAGES]...
+Usage: embeddings-sort [OPTIONS] [IMAGES]...
 
 Arguments:
   [IMAGES]...
 
 Options:
-  -e, --embedder  Characteristic to sort by [default: content] [possible values: brightness, hue, color, content]
+  -e, --embedder  Characteristic to sort by [default: content-euclidean] [possible values: brightness, hue, color, content-euclidean, content-angular-distance, content-manhatten]
   -s, --symlink-dir  Symlink the sorted images into this directory
   -o, --copy-dir  Copy the sorted images into this directory. Uses COW when available
   -c, --stdout  Write sorted paths into stdout, one per line
   -0, --stdout0  Write sorted paths into stdout, null-separated. Overrides -c
+  -b, --benchmark  Output total tour length to stderr
+      --tsp-approx  Algorithm for TSP approximation. Leave as default if unsure [default: christofides] [possible values: mst-dfs, christofides, christofides-refined]
   -h, --help  Print help
 ```
 
 ## Insides
-The TSP approximation is done by using Prim's algorithm and doing a DFS through the resulting MST, giving a 2-approximation, which gives ok-ish results, but could be improved by using something like Christofides algorithm and doing attempts at improving the initial approximation. This is O(n²) time, however even for 10k images this should still be much quicker than the embedding step. The embeddings are therefore cached, usually in `$HOME/.cache/embeddings-sort`.
+The Christofides implementation uses an approximate min-weight matching algorithm, which may be non-ideal, though I haven't benchmarked how much of a difference it makes (mainly because an exact algorithm is complex to implement and would also raise the implementation's time complexity from O(n²) to O(n³), where n is the number of given images).
+
+`christofides-refined` is planned to be `christofides` with an O(n²) 2-opt-swapping step added after the main algorithm. Implementing this efficiently will also require some algorithmic trickery, so it's not ready yet.
--
cgit v1.2.3-70-g09d2