# embeddings-sort
This program sorts images so that ones with similar motifs end up close together. It does this by using [AI](https://github.com/minimaxir/imgbeddings) to extract an embedding that captures the meaning of each image, and then approximating a travelling-salesperson tour through all of them.
As a bonus feature, this program can also sort the images by hue, brightness or color, though the results for these could be improved by using a less generalized algorithm.
The sorted order can be accessed by letting the program print the image paths in order, or by copying/symlinking the images into a new directory.
Detailed usage:
```
embeddings-sort [OPTIONS] [IMAGES]...

Arguments:
  [IMAGES]...

Options:
  -e, --embedder <EMBEDDER>        Characteristic to sort by [default: content] [possible values: brightness, hue, color, content]
  -s, --symlink-dir <SYMLINK_DIR>  Symlink the sorted images into this directory
  -o, --copy-dir <COPY_DIR>        Copy the sorted images into this directory. Uses COW when available
  -c, --stdout                     Write sorted paths into stdout, one per line
  -0, --stdout0                    Write sorted paths into stdout, null-separated. Overrides -c
  -h, --help                       Print help
```
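Scripts that consume the `-0` output need to split it on null bytes rather than newlines, which keeps paths containing spaces or newlines intact. The following is a minimal Python sketch of that parsing step. The function name `split_nul_paths` is hypothetical, and whether the program emits a trailing null byte after the last path is an assumption, so the sketch accepts both forms.

```python
import os


def split_nul_paths(data: bytes) -> list[str]:
    """Split null-separated path bytes into a list of path strings.

    Hypothetical helper: `data` is assumed to be the raw stdout
    captured from a run with the -0 / --stdout0 option.
    """
    # Drop empty chunks so a trailing null byte (if present) is harmless.
    chunks = [chunk for chunk in data.split(b"\0") if chunk]
    # os.fsdecode turns filesystem bytes into str using the OS encoding.
    return [os.fsdecode(chunk) for chunk in chunks]


if __name__ == "__main__":
    sample = b"holiday/beach 1.jpg\0holiday/beach 2.jpg\0"
    for path in split_nul_paths(sample):
        print(path)
```

The order of the returned list is the order of the input, so the sorted order is preserved.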
## Insides
The TSP approximation is done by building a minimum spanning tree (MST) with Prim's algorithm and then walking it with a depth-first search (DFS). This gives a 2-approximation, which produces ok-ish results, but could be improved by using something like Christofides' algorithm and then trying to improve the initial approximation. This takes O(n²) time; however, even for 10k images it should still be much quicker than the embedding step. Because the embedding step is the slow part, the embeddings are cached, usually in `$HOME/.cache/embeddings-sort`.
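
The MST-then-DFS idea can be illustrated with a short Python sketch. This is not the program's actual implementation: the function name `mst_tour`, the use of Euclidean distance between embedding vectors, and the choice of index 0 as the start are all assumptions made for illustration.

```python
import math


def mst_tour(points):
    """Return a visiting order of `points` from a DFS preorder of their MST.

    `points` is a list of equal-length coordinate tuples (standing in for
    embeddings). Distance is assumed to be Euclidean.
    """
    n = len(points)
    if n == 0:
        return []

    in_tree = [False] * n
    best_dist = [math.inf] * n   # cheapest known edge into the tree
    parent = [-1] * n            # tree vertex that edge comes from
    children = [[] for _ in range(n)]
    best_dist[0] = 0.0           # start Prim's algorithm at index 0

    for _ in range(n):
        # Dense-graph Prim: a linear scan for the closest outside vertex,
        # giving O(n^2) overall without a priority queue.
        u = min((i for i in range(n) if not in_tree[i]),
                key=lambda i: best_dist[i])
        in_tree[u] = True
        if parent[u] != -1:
            children[parent[u]].append(u)
        # Relax the edges from the newly added vertex.
        for v in range(n):
            if not in_tree[v]:
                d = math.dist(points[u], points[v])
                if d < best_dist[v]:
                    best_dist[v] = d
                    parent[v] = u

    # Iterative DFS preorder over the MST; the preorder is the tour.
    order = []
    stack = [0]
    while stack:
        u = stack.pop()
        order.append(u)
        stack.extend(reversed(children[u]))
    return order


if __name__ == "__main__":
    pts = [(0, 0), (10, 0), (1, 0), (11, 0)]
    print(mst_tour(pts))
```

The linear scan in place of a heap is what keeps the sketch at O(n²) on a complete graph, where every pair of images has a distance.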