diff options
author | Lia Lenckowski <lialenck@protonmail.com> | 2024-02-17 21:37:01 +0100 |
---|---|---|
committer | Lia Lenckowski <lialenck@protonmail.com> | 2024-02-17 21:37:01 +0100 |
commit | 94185d8b39af5a96a61ffb4df7a2d76dcd7afa49 (patch) | |
tree | fec9433815d1e764c1b12839a2f35e9aa861757e /README.md | |
parent | c65db2b7bf16f1e1b4f35cf796d3d1d78c1a9332 (diff) | |
download | embeddings-sort-94185d8b39af5a96a61ffb4df7a2d76dcd7afa49.tar embeddings-sort-94185d8b39af5a96a61ffb4df7a2d76dcd7afa49.tar.bz2 embeddings-sort-94185d8b39af5a96a61ffb4df7a2d76dcd7afa49.tar.zst |
update README with new approx algorithm and metrics
Diffstat (limited to 'README.md')
-rw-r--r-- | README.md | 10 |
1 files changed, 7 insertions, 3 deletions
@@ -8,19 +8,23 @@ The sorting can be accessed by letting the progam print the image paths in order Detailed usage: ``` -embeddings-sort [OPTIONS] [IMAGES]... +Usage: embeddings-sort [OPTIONS] [IMAGES]... Arguments: [IMAGES]... Options: - -e, --embedder <EMBEDDER> Characteristic to sort by [default: content] [possible values: brightness, hue, color, content] + -e, --embedder <EMBEDDER> Characteristic to sort by [default: content-euclidean] [possible values: brightness, hue, color, content-euclidean, content-angular-distance, content-manhatten] -s, --symlink-dir <SYMLINK_DIR> Symlink the sorted images into this directory -o, --copy-dir <COPY_DIR> Copy the sorted images into this directory. Uses COW when available -c, --stdout Write sorted paths into stdout, one per line -0, --stdout0 Write sorted paths into stdout, null-separated. Overrides -c + -b, --benchmark Output total tour length to stderr + --tsp-approx <TSP_APPROX> Algorithm for TSP approximation. Leave as default if unsure [default: christofides] [possible values: mst-dfs, christofides, christofides-refined] -h, --help Print help ``` ## Insides -The TSP approximation is done by using Prim's algorithm and doing a DFS through the resulting MST, giving a 2-approximation, which gives ok-ish results, but could be improved by using something like Christofides algorithm and doing attempts at improving the initial approximation. This is O(n²) time, however even for 10k images this should still be much quicker than the embedding step. The embeddings are therefore cached, usually in `$HOME/.cache/embeddings-sort`. +The chrisofides implementation uses an approximated min-weight matching algorithm, which may be non-ideal, though I haven't benchmarked how much of a difference it makes (mainly due to the implementation complexity of an exact algorithm, which would also increase the implementations complexity from O(n²) to O(n³) where n is the number of given images). + +christofides-refined is planned to be christofides but with an O(n²) 2-opt-swapping step added after the main algorithm. Implementing this efficiently will also require some algorithmic trickery, so it's not ready yet. |