Back in 2014 I posted my Emacs setup for finding files globally. It acts the way switch-to-buffer would if all the files on my system were already open in buffers: I can type in a substring of any filename I'm likely to work on and switch to that buffer. Since then, helm has changed its interface, and I've updated my code, adding a few extra quality-of-life features.

I use the same hourly cron job to take inventory of all files I am likely to edit, putting recently modified files near the top, as they are most likely to be the best match. I save this list to ~/.global-file-list.txt.

What's changed is how I set up helm. In 2014 I was using helm-recentf and temporarily binding the recentf file list to my own list. This was a quick & dirty way to do it, and it worked, except when I opened a file, it didn't get put onto the recentf list, since I had let-bound that list to something else. Oops. I learned more helm (by reading the source code and John Kitchin's blog posts) and now construct my own helm source instead of abusing helm-recentf:

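(require 'helm)
(require 'helm-files)   ; file-related helpers used below
(require 'helm-locate)  ; defines helm-source-locate

(defun amitp/helm-all-files ()
  "Global filename match, over all files I typically open."
  (interactive)
  (helm
   :sources '(amitp/helm-source-my-files helm-source-locate)
   :buffer "*helm all files*"))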

Helm will look at this source to get the filename list:

(defvar amitp/helm-source-my-files
  (helm-build-sync-source "My files"
    :candidates #'amitp/helm-global-file-list
    :filtered-candidate-transformer #'amitp/helm-filter-my-files
    :keymap helm-generic-files-map
    :action 'helm-type-file-actions))

This source tells helm to call amitp/helm-global-file-list. This function returns the most recently modified files at the top, then the currently open files, then files in the current folder, then recently opened files, then the global file list. (I'm not happy with this order and am still experimenting.) Some filenames will be in more than one list so I eliminate duplicates.

(defun amitp/helm-global-file-list ()
  "Files to list in amitp/helm-all-files."
  ;; delete-dups is much faster than cl-remove-duplicates
  (delete-dups
   (mapcar #'abbreviate-file-name
           (append
            (read-file-into-lines "~/.recent-file-list.txt") ; recently modified
            (amitp/buffer-file-names)                        ; currently open
            (helm-skip-boring-files                          ; current folder
             (directory-files default-directory t))
            recentf-list                                     ; recently opened
            amitp/global-file-list))))                       ; hourly inventory
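The function above relies on read-file-into-lines, amitp/buffer-file-names, and amitp/global-file-list, none of which are shown in this post. What follows is a minimal sketch of what they could look like, not the actual definitions; amitp/reload-global-file-list is a hypothetical extra, and the sketch assumes the inventory file holds one filename per line.

```elisp
;; Hypothetical stand-ins for helpers the post references but does not show.

(defun read-file-into-lines (filename)
  "Return the non-empty lines of FILENAME as a list of strings.
Return nil when FILENAME is missing or unreadable."
  (when (file-readable-p filename)
    (with-temp-buffer
      (insert-file-contents filename)
      (split-string (buffer-string) "\n" t))))

(defun amitp/buffer-file-names ()
  "Return the names of files visited by open buffers."
  ;; Buffers that do not visit a file yield nil, which delq drops.
  (delq nil (mapcar #'buffer-file-name (buffer-list))))

(defvar amitp/global-file-list nil
  "Cached contents of ~/.global-file-list.txt.")

(defun amitp/reload-global-file-list ()
  "Re-read the hourly inventory into `amitp/global-file-list'."
  (interactive)
  (setq amitp/global-file-list
        (read-file-into-lines "~/.global-file-list.txt")))

;; Load the list once when this code is evaluated.
(amitp/reload-global-file-list)
```

Because the cron job rewrites the inventory every hour, a cached list like this goes stale unless it is reloaded, for example with (run-with-timer 3600 3600 #'amitp/reload-global-file-list).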

I also use a filter to remove some filenames from the list. Why? Normally I want to show only the “source” and not the “compiled” version of something. For example I want to show .el files but not .elc files. I want to show .c files but not .o files. I can hide these with helm-boring-file-regexp-list. However, for my web pages, some of them are directly written as .html files (*.html should be included) but others are written in org-mode or markdown or something else (*.html should be excluded). To decide whether a file ending in .html is source or not, I need to look at the other filenames in the list. I use a “filtered candidate transformer” (see the documentation for helm-source) to take the currently matching filenames and filter them further. If the filenames contain both $something.org and $something.html then I know that the html is not a source file, so I hide it.

(require 'cl-lib)  ; for cl-loop

(defun dominated-by-filename (regexp replacement filename candidates)
  "True if FILENAME with REGEXP replaced by REPLACEMENT is already in CANDIDATES."
  (let ((new-filename (replace-regexp-in-string regexp replacement filename)))
    ;; If the regexp didn't match, the name is unchanged and nothing dominates it
    (and (not (equal new-filename filename))
         (member new-filename candidates))))

(defun amitp/helm-filter-my-files (candidates _source)
  "Ignore a build target if a build source exists in the candidates."
  (cl-loop for filename in candidates
           unless
           (or
            ;; \\' anchors at the end of the string
            (dominated-by-filename "\\.html\\'" ".org" filename candidates)
            (dominated-by-filename "\\.html\\'" ".md" filename candidates))
           ;; (I have more rules but you get the idea)
           collect filename))
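As the number of rules grows, one option is to move them into a table of (REGEXP . REPLACEMENT) pairs instead of repeating the call. This is a minimal sketch of that variation rather than the configuration described here; my/build-target-rules, my/build-target-p, and my/filter-build-targets are hypothetical names, and the .elc rule is included only to illustrate a third entry.

```elisp
(require 'cl-lib)

;; Hypothetical rule table: each pair maps a build target's name
;; to the name of the source file that would generate it.
(defvar my/build-target-rules
  '(("\\.html\\'" . ".org")
    ("\\.html\\'" . ".md")
    ("\\.elc\\'" . ".el"))
  "Rules for recognizing build targets from their source files.")

(defun my/build-target-p (filename candidates)
  "Non-nil if some rule maps FILENAME to another name in CANDIDATES."
  (cl-some (lambda (rule)
             (let ((source (replace-regexp-in-string
                            (car rule) (cdr rule) filename)))
               ;; An unchanged name means the rule did not apply.
               (and (not (equal source filename))
                    (member source candidates))))
           my/build-target-rules))

(defun my/filter-build-targets (candidates _source)
  "Return CANDIDATES without files whose build source is also listed."
  (cl-remove-if (lambda (filename)
                  (my/build-target-p filename candidates))
                candidates))

(my/filter-build-targets
 '("~/site/a.org" "~/site/a.html" "~/site/b.html" "~/lisp/x.el" "~/lisp/x.elc")
 nil)
;; => ("~/site/a.org" "~/site/b.html" "~/lisp/x.el")
```

Here a.html is hidden because a.org is present, while b.html stays visible because no matching source is in the list.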

I don't currently colorize the output. I've considered coloring by type (source code, prose, build file, etc.) and by origin (recentf, current folder, open buffer, global list) but neither of these seems particularly appealing. I'll continue to experiment.

The last change since 2014 addresses a gap: my global file list doesn't have everything, and it is annoyingly missing any new files I've created in the past hour. In 2014 I set up C-l to switch to helm-locate so that if I was unable to find what I wanted, I could have it search more. However, I never remembered to use it.

I now have two ways to solve that problem. First, I augment my hourly cron job with a quick cron job that runs every minute and reports any files edited in the last two hours (7200 seconds). I had started with files added recently, but realized that if I changed it to files edited recently, I could also put those files at the top of my list. I run this every minute:

mdfind -onlyin $HOME "kMDItemFSContentChangeDate > \$time.now(-7200)
   && kMDItemContentTypeTree = 'public.text'" >$HOME/.recent-file-list.txt

In the helm source, I read ~/.recent-file-list.txt (a very short list) and put those items ahead of others.

The second solution is to use helm-source-locate as a source in the main interface, even though the UI is incompatible (for regular results you can use space to separate words, but for locate you can't). It will show the locate results below the main results. When neither my global file list nor my recent file list shows the file I want, locate might find it.

(setq helm-locate-command "/Users/amitp/bin/locate %.0s %s")
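The %.0s in that format string is worth a note. As far as I can tell, helm-locate fills helm-locate-command with two arguments: a case-sensitivity flag, then the search pattern. A precision of zero truncates the first argument to an empty string, so only the pattern reaches the script. A small check with plain format, where "-i" and "foo/bar" are made-up stand-ins for what helm would pass:

```elisp
;; Assumption: helm supplies the case flag first and the pattern second.
;; %.0s consumes the flag but prints none of it.
(format "/Users/amitp/bin/locate %.0s %s" "-i" "foo/bar")
;; => "/Users/amitp/bin/locate  foo/bar"
```

The result has two spaces before the pattern, which the shell ignores when splitting the command into words.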

The locate command that helm-locate-command points to is my own shell script, not the system locate. It uses Mac's mdfind (which is updated in real time) to find the base filename, and then I filter that through grep to handle folders. For example, if I run locate foo/bar then I use mdfind to find files whose names start with bar, then in the results I grep for foo/bar. However, if I run locate ar without a folder name then I should find bar as a substring match. The shell script isn't perfect but it's “good enough” for now.

#!/bin/bash
rawquery="$*"
suffix=$(basename -- "$rawquery")
pattern="${suffix}*"

if [ "$rawquery" = "$suffix" ]; then
    # If there's no folder then it should be a substring
    pattern="*${suffix}*"
    # NOTE: double wildcards are slower!
fi

# The "cd" after the pattern makes the match case- and diacritic-insensitive.
# postprocess is not shown in this post.
mdfind -onlyin "$HOME" "kMDItemFSName = '$pattern'cd" \
    | sed -e "s:^$HOME:~:" \
    | grep -F -i -- "$rawquery" \
    | postprocess
So far I've not needed the locate results, so I might end up removing that part of the code. I think the minutely cron job might be all I need.

I'm much happier with my setup now compared to the original, but there's always more tweaking to do. It's Emacs after all!
