author Lia Lenckowski <lialenck@protonmail.com> 2023-09-10 16:42:48 +0200
committer Lia Lenckowski <lialenck@protonmail.com> 2023-09-10 16:42:48 +0200
commit 970e44a28a57c6299ddde7a107db7a024f901fd7
tree dd44ff5b2d852e4f499f49eb206035d3b0006d73
parent 2590cda29da4c4a0dd05a6f91271c08c0335497d
update README.md
-rw-r--r-- README.md 25
1 files changed, 24 insertions, 1 deletions
diff --git a/README.md b/README.md
index 886f3ef..fd38eab 100644
--- a/README.md
+++ b/README.md
@@ -1,3 +1,26 @@
# embeddings-sort
-Sort your images according to AI embeddings, or something. Still a bit WIP, but can already sort a list of images by their hue or brightness
+This program sorts images so that ones with similar motifs end up close together. It does this by using [AI](https://github.com/minimaxir/imgbeddings) to extract the meaning of each image, and then approximating a travelling salesperson tour through all of them.
+As a bonus feature, this program can also sort the images by hue, brightness or color, though the results for this could be improved by using a less generalized algorithm.
+
+The sorted order can be obtained by letting the program print the image paths in order, or by copying/symlinking the images into a new directory.
+
+Detailed usage:
+
+```
+embeddings-sort [OPTIONS] [IMAGES]...
+
+Arguments:
+ [IMAGES]...
+
+Options:
+ -e, --embedder <EMBEDDER> Characteristic to sort by [default: content] [possible values: brightness, hue, color, content]
+ -s, --symlink-dir <SYMLINK_DIR> Symlink the sorted images into this directory
+ -o, --copy-dir <COPY_DIR> Copy the sorted images into this directory. Uses COW when available
+ -c, --stdout Write sorted paths into stdout, one per line
+ -0, --stdout0 Write sorted paths into stdout, null-separated. Overrides -c
+ -h, --help Print help
+```
+
+## Insides
+The TSP approximation runs Prim's algorithm and then does a DFS through the resulting MST, visiting the images in preorder. This gives a 2-approximation (assuming the distances satisfy the triangle inequality), and the results are ok-ish. They could be improved by using something like Christofides' algorithm and by attempting to improve the initial approximation afterwards. The approach takes O(n²) time, but even for 10k images this should still be much quicker than the embedding step. Because the embedding step dominates, the embeddings are cached, usually in `$HOME/.cache/embeddings-sort`.
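
As a rough illustration of the approach the Insides section describes, here is a minimal Python sketch. It is not the project's actual code. It assumes embeddings are plain coordinate tuples compared by Euclidean distance, and the names `prim_mst` and `tour_order` are hypothetical.

```python
import math


def prim_mst(points):
    """Build an MST over `points` with the dense O(n^2) form of Prim's algorithm.

    Returns adjacency lists. Euclidean distance is an assumption of this sketch.
    """
    n = len(points)
    adj = [[] for _ in range(n)]
    if n == 0:
        return adj
    in_tree = [False] * n
    best = [math.inf] * n  # cheapest known edge from the tree to each vertex
    parent = [-1] * n
    best[0] = 0.0
    for _ in range(n):
        # Pick the vertex outside the tree that is cheapest to attach.
        u = min((i for i in range(n) if not in_tree[i]), key=lambda i: best[i])
        in_tree[u] = True
        if parent[u] != -1:
            adj[parent[u]].append(u)
            adj[u].append(parent[u])
        # Relax the attachment cost of every vertex still outside the tree.
        for v in range(n):
            if not in_tree[v]:
                d = math.dist(points[u], points[v])
                if d < best[v]:
                    best[v] = d
                    parent[v] = u
    return adj


def tour_order(points):
    """Return vertex indices in DFS preorder over the MST, starting at vertex 0."""
    adj = prim_mst(points)
    if not points:
        return []
    order = []
    seen = [False] * len(points)
    stack = [0]
    while stack:
        u = stack.pop()
        if seen[u]:
            continue
        seen[u] = True
        order.append(u)
        # Reverse so that neighbours are visited in the order they were added.
        stack.extend(reversed(adj[u]))
    return order


if __name__ == "__main__":
    pts = [(0, 0), (10, 0), (1, 0), (11, 0), (2, 0)]
    print(tour_order(pts))
```

The dense form of Prim's algorithm is used because every pair of points has a distance, so a heap-based variant would not reduce the O(n²) cost.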