Back in 2016 I posted my emacs setup for finding files globally. It unifies find-file and switch-to-buffer, and also lets me find files without switching folders. I've been improving it since then and wanted to post an update.

[Screenshot: Find files globally]

The biggest change is speed. I profiled the code and found two bottlenecks in the construction of the global file list. The list is roughly 15000 elements, so I precomputed as much of that as I could, and the startup time went down from 0.8 seconds to 0.1 second. It feels instant now.
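For anyone who wants to take similar measurements, here is a minimal sketch of a timing helper built on the built-in benchmark library. The name my/seconds-to-run is hypothetical and is not part of my setup; the numbers above were not necessarily produced this way.

```elisp
(require 'benchmark)

;; Hypothetical helper, shown only for illustration.
(defun my/seconds-to-run (fn)
  "Return the wall-clock seconds taken by one call to FN with no arguments."
  ;; benchmark-run returns (ELAPSED GC-COUNT GC-ELAPSED); keep ELAPSED only.
  (car (benchmark-run 1 (funcall fn))))

;; Example: time a trivial function.
(my/seconds-to-run (lambda () (length (number-sequence 1 15000))))
```

Calling it as (my/seconds-to-run #'amitp/all-files-list) before and after a change gives a rough comparison of how long the file list takes to build.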

There are three components to this system.

1 Make a list of files

From an hourly cronjob, I make a global list of filenames that I am likely to work with. This is around 15000 on my system. I sort them so that the newer ones are closer to the top:

(mdfind -onlyin $HOME "kMDItemContentTypeTree = 'public.text' && kMDItemFSContentChangeDate > \$time.today(-1)";\
 mdfind -onlyin $HOME "kMDItemContentTypeTree = 'public.text' && kMDItemFSContentChangeDate > \$time.today(-4) && kMDItemFSContentChangeDate <= \$time.today(-1)";\
 mdfind -onlyin $HOME "kMDItemContentTypeTree = 'public.text' && kMDItemFSContentChangeDate > \$time.today(-14) && kMDItemFSContentChangeDate <= \$time.today(-4)";\
 mdfind -onlyin $HOME "kMDItemContentTypeTree = 'public.text' && kMDItemFSContentChangeDate > \$time.today(-45) && kMDItemFSContentChangeDate <= \$time.today(-14)";\
 mdfind -onlyin $HOME "kMDItemContentTypeTree = 'public.text' && kMDItemFSContentChangeDate <= \$time.today(-45)"\
) >~/.global-file-list.txt

From a cronjob that runs every minute, I get everything that I've touched in the last two hours (7200 seconds). On my Mac with an SSD, it takes 0.1 seconds to run this:

mdfind -onlyin $HOME "kMDItemContentTypeTree = 'public.text' && kMDItemFSContentChangeDate > \$time.now(-7200)"\
> ~/.recent-file-list.txt

My actual queries are a bit more convoluted than this, because I also want some files that aren't classified as public.text. You'll have to modify these queries to match your needs.

I don't know how to make the recent-file-list query fast on Linux. Mac's mdfind is integrated into the file system and indexes in real time; Linux has find and locate but both take a lot longer.

2 Static file list

I have this helper function for reading a text file into a list of strings:

(defun read-file-into-lines (filename)
  "Read file, split into lines, return a list"
  (with-temp-buffer
    (condition-case nil
        (insert-file-contents filename)
      (file-error (message "Could not read file %s" filename)))
    (split-string (buffer-substring-no-properties (point-min) (point-max)) "\n" t)))

At startup, and again whenever Emacs has been idle for 20 minutes, I read .global-file-list.txt into an Emacs list, reordering it so that the more useful files are closer to the top:

(defvar amitp/global-file-list '() "List of most files I work with.")
(defun amitp/update-global-file-list ()
  "Periodically update my global file list."
  (condition-case nil
      (let* ((new-file-list (read-file-into-lines "~/.global-file-list.txt")))
        (when new-file-list
          (setq amitp/global-file-list
                (amitp/reorder-file-list (mapcar 'abbreviate-file-name new-file-list)))
          ))
    (file-error (message "amitp/update-global-file-list: could not read global file list") nil)))

(amitp/update-global-file-list)
(defvar amitp/update-global-file-list-timer
  (run-with-idle-timer 1200 t #'amitp/update-global-file-list))

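How do I sort? Right now it's mainly looking for generated files and pushing them to the bottom. I use a helper function that applies a regexp replacement to a filename to compute the name of the source file it would have been generated from (for example foo.html becomes foo.org), and then checks whether that source file is also in the list: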

(defun dominated-by-filename (regexp replacement filename candidates)
  "True if FILENAME with REGEXP replaced by REPLACEMENT is already in CANDIDATES"
  (let ((new-filename (replace-regexp-in-string regexp replacement filename)))
    (and (not (equal new-filename filename))
         (member new-filename candidates))))

And then anything that is generated gets put at the bottom of the list:

(defun amitp/reorder-file-list (file-list)
  "Reorder FILE-LIST so that undominated filenames come first"
  (let ((undominated nil)
        (dominated nil))
    (dolist (filename file-list (reverse (append dominated undominated)))
      (if (or
           (dominated-by-filename "\\.html$" ".org" filename file-list)
           (dominated-by-filename "\\.html$" ".md" filename file-list))
          (push filename dominated)
        (push filename undominated)))))
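Each call to dominated-by-filename on an .html file scans the whole candidate list with member. If that ever shows up in a profile, one alternative is to index the filenames in a hash table first. This is a sketch only, under the assumption that the same two rules apply (.html is dominated by .org or .md); the name my/reorder-file-list-hashed is hypothetical and this is not the code I actually run.

```elisp
;; Hypothetical variant of amitp/reorder-file-list, for illustration.
(defun my/reorder-file-list-hashed (file-list)
  "Return FILE-LIST with generated .html files moved to the end.
An .html file counts as generated when an .org or .md file with
the same base name is also in FILE-LIST."
  (let ((seen (make-hash-table :test 'equal))
        (undominated nil)
        (dominated nil))
    ;; Index every filename once so each membership test is a hash lookup.
    (dolist (filename file-list)
      (puthash filename t seen))
    (dolist (filename file-list)
      (if (and (string-match "\\.html\\'" filename)
               (let ((base (substring filename 0 (match-beginning 0))))
                 (or (gethash (concat base ".org") seen)
                     (gethash (concat base ".md") seen))))
          (push filename dominated)
        (push filename undominated)))
    ;; push built both lists in reverse, so restore the original order.
    (append (nreverse undominated) (nreverse dominated))))

(my/reorder-file-list-hashed
 '("a.html" "a.org" "b.html" "c.md" "c.html"))
;; => ("a.org" "b.html" "c.md" "a.html" "c.html")
```

The relative order inside each group is preserved, so the "newer files first" ordering from the cronjob survives the reordering.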

I could add some sort of priority system that prefers certain folders or filename extensions, but I haven't done that yet.

3 Dynamic file list

The static file list is useful but I want to merge other data sources into it:

  1. Most recent file from recentf
  2. Recently modified files (.recent-file-list.txt from above)
  3. Currently open buffers
  4. Files in the current directory
  5. Other recent files from recentf

I merge these and delete duplicates:

(require 'dash)     ; for -take and -drop
(require 'recentf)  ; for recentf-list

(defun amitp/all-files-list ()
  "Files to list in amitp/ivy-all-files and amitp/helm-all-files"
  ;; Use destructive delete-dups for speed
  (delete-dups
   (append
    (mapcar 'abbreviate-file-name
            (append
             (-take 1 recentf-list)
             (read-file-into-lines "~/.recent-file-list.txt")
             (amitp/buffer-file-names)
             (directory-files default-directory t)
             (-drop 1 recentf-list)))
    amitp/global-file-list
    ;; The trailing nil makes append copy amitp/global-file-list, so the
    ;; destructive delete-dups cannot modify the cached list.
    nil)))
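The function above calls amitp/buffer-file-names, which is not listed in this post. A minimal stand-in, assuming all it needs to do is return the filenames of buffers that are visiting files, might look like this. The name my/buffer-file-names is hypothetical and the body is a guess at the behavior, not the original definition.

```elisp
;; Hypothetical stand-in for amitp/buffer-file-names.
(defun my/buffer-file-names ()
  "Return the file names of all buffers that are visiting a file."
  ;; buffer-file-name returns nil for buffers such as *scratch*,
  ;; so drop those entries.
  (delq nil (mapcar #'buffer-file-name (buffer-list))))
```

Since mapcar returns a fresh list, the destructive delq is safe here.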

I don't know that this order is the best; I need to continue experimenting.

4 Searching with Helm

To make this work with Helm I need to build a Helm “source”:

(defvar amitp/helm-source-my-files
  (helm-build-sync-source "My files"
    :candidates #'amitp/all-files-list
    :keymap helm-generic-files-map
    :action 'helm-type-file-actions))

I then call Helm with that source:

(setq helm-ff-transformer-show-only-basename t)
(defun amitp/helm-all-files ()
  "Global filename match, over all files I typically open"
  (interactive)
  (let ((helm-ff-transformer-show-only-basename nil))
    (helm
     :sources '(amitp/helm-source-my-files helm-source-locate)
     :buffer "*helm all files*")))

[Screenshot: Helm find all files]

Optional: use :filtered-candidate-transformer to filter out the unwanted filenames.
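As a sketch of what such a transformer could look like: Helm calls a filtered candidate transformer with the candidate list and the source, and uses whatever list it returns. The regexp and the names my/helm-unwanted-file-regexp and my/helm-drop-unwanted-files below are hypothetical; adjust the pattern to whatever you consider unwanted.

```elisp
(require 'seq)

;; Hypothetical pattern: byte-compiled files and backup files.
(defvar my/helm-unwanted-file-regexp "\\(?:\\.elc\\|~\\)\\'"
  "Regexp matching filenames to hide from the Helm file list.")

(defun my/helm-drop-unwanted-files (candidates _source)
  "Return CANDIDATES without entries matching `my/helm-unwanted-file-regexp'."
  (seq-remove
   (lambda (candidate)
     ;; Helm candidates are either strings or (DISPLAY . REAL) pairs.
     (let ((real (if (consp candidate) (cdr candidate) candidate)))
       (and (stringp real)
            (string-match-p my/helm-unwanted-file-regexp real))))
   candidates))

;; To use it, add this keyword to the helm-build-sync-source call:
;;   :filtered-candidate-transformer #'my/helm-drop-unwanted-files
```

Because the transformer runs on the already-matched candidates, it only has to look at what Helm is about to display rather than the full list.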

5 Searching with Ivy

I also made an Ivy version. Ivy uses the minibuffer instead of a full buffer and feels lighter than Helm. To make Ivy match words in any order like I'm used to with Helm:

(setq ivy-re-builders-alist '((t . ivy--regex-ignore-order)))

Ivy's interface is a bit simpler than Helm's:

(defun amitp/ivy-all-files ()
  "Global filename match, over all files I typically open"
  (interactive)
  (ivy-read "All files: " (amitp/all-files-list)
            :caller 'amitp/ivy-all-files
            :require-match t
            :action #'find-file))

[Screenshot: Ivy find all files]

The customization for Ivy seems to happen in three ways:

  1. Pass in parameters to ivy-read
  2. Modify global variables like ivy-sort-functions-alist
  3. Call functions like ivy-set-display-transformer

For the latter two, I gave the Ivy command a name with :caller 'amitp/ivy-all-files, and then 'amitp/ivy-all-files is the key used in the customization lists. I'm new to Ivy and don't understand why there are three different systems for customization.

Here's how to re-sort the candidates on each keystroke:

(defun amitp/ivy-all-files-sort (_name candidates)
  "Re-sort ivy filename candidates to prefer non-dominated filenames"
  (if (< (length candidates) 1000)
      (amitp/reorder-file-list candidates)
    candidates))
(add-to-list 'ivy-sort-matches-functions-alist
              '(amitp/ivy-all-files . amitp/ivy-all-files-sort))

Here's how to display some items in a different face:

(defun amitp/ivy-all-files-transformer (filename)
  "Display filename differently if it's dominated"
  (if (or
       (dominated-by-filename "\\.html$" ".org" filename amitp/global-file-list)
       (dominated-by-filename "\\.html$" ".md" filename amitp/global-file-list))
      (propertize filename 'face 'dired-ignored)
    filename))
(ivy-set-display-transformer
 'amitp/ivy-all-files #'amitp/ivy-all-files-transformer)

6 Next steps

This is highly customized to my system, so I don't think it'll be directly useful to you, but I hope you can take these pieces and use them for your own setup.

Things I'd like to do:

  1. Prioritize the current project's files in the sort order
  2. Highlight folders and current project files differently
  3. Improve the match order to prefer matches towards the end of the filename, so that bar matches /foo/bar.txt before /bar/foo.txt
  4. Collect data on which matches I actually select, so that I can adjust the sort order to better match what I want
  5. Figure out how to integrate locate results into Ivy, like I have with Helm. I tried the multiple-source approach from here but it didn't display the main list until the locate was finished, which added too much latency.
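
For item 3, one possible starting point is a sort function with the same (name candidates) signature as amitp/ivy-all-files-sort above. This is an untested idea rather than something I use: it assumes the input is plain space-separated words (no regexp syntax) and simply moves candidates whose basename contains every word ahead of the rest. The name my/ivy-prefer-basename-matches is hypothetical.

```elisp
(require 'seq)

;; Hypothetical sort function, for illustration only.
(defun my/ivy-prefer-basename-matches (name candidates)
  "Return CANDIDATES with basename matches for NAME listed first.
NAME is split on whitespace; a candidate ranks first when every
word occurs in its basename.  Relative order is otherwise kept."
  (let ((words (split-string name "[ \t]+" t))
        (in-basename nil)
        (elsewhere nil))
    (dolist (candidate candidates)
      (let ((basename (file-name-nondirectory candidate)))
        (if (seq-every-p
             (lambda (word)
               ;; Treat each word literally, not as a regexp.
               (string-match-p (regexp-quote word) basename))
             words)
            (push candidate in-basename)
          (push candidate elsewhere))))
    ;; push reversed both groups, so restore the incoming order.
    (append (nreverse in-basename) (nreverse elsewhere))))

(my/ivy-prefer-basename-matches "bar" '("/bar/foo.txt" "/foo/bar.txt"))
;; => ("/foo/bar.txt" "/bar/foo.txt")
```

It could be registered in ivy-sort-matches-functions-alist under the 'amitp/ivy-all-files key, in place of or combined with the existing sort function.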


1 comment:

Chloe wrote at Friday, February 16, 2018 at 6:32:00 PM PST

This is very great information, step by step, love it. More updates on it, if there's something to add. Thanks a lot for sharing this.