Install and configure hunspell

Imagine this: For intricate reasons, you have decided to get your Emacs setup working on Windows as well, although you have a perfectly fine and working WSL2 configuration.

You’re surprised by how well this goes (winget install GNU.Emacs FTW!), until you decide to set up the hunspell spell checker…

It starts pretty well: you install hunspell with a simple winget install FSFhu.Hunspell, download a set of English dictionaries from the LibreOffice extension, and set your DICPATH environment variable to point to the directory containing all of the unpacked .aff and .dic files.

In init.el, you set up the default dictionary and the full path to the hunspell binary:

(setenv "DICTIONARY" "en_GB")
(setq ispell-program-name "C:/Users/cpbot/AppData/Local/Microsoft/WinGet/Links/hunspell.exe")
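
If you would rather keep everything in init.el, DICPATH can be set from Emacs as well, since sub-processes inherit whatever setenv puts into the environment. This is only a sketch: the directory is the one that shows up in the hunspell search path for this particular machine, so treat it as an assumption and substitute your own.

```elisp
;; Sketch only: the directory below is an assumption taken from this
;; machine's hunspell output; point it at wherever you unpacked the
;; .aff and .dic files.
(setenv "DICPATH" "C:/Users/cpbot/OneDrive/Documents/hunspell")
```

This has to be evaluated before the first spell-check starts hunspell, so it belongs above any ispell-related setup.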

See Emacs reject this

After this, you invoke exactly the same hunspell command that Emacs does as part of (defun ispell-find-hunspell-dictionaries (&optional dictionary)):

> hunspell -D -d en_GB NUL
SEARCH PATH:
.;;C:\Users\cpbot\OneDrive\Documents\hunspell\;C:\Hunspell\;%USERPROFILE%\Application Data\OpenOffice.org 2\user\wordbook;C:\Program files\OpenOffice.org 2.4\share\dict\ooo\;C:\Program files\OpenOffice.org 2.3\share\dict\ooo\;C:\Program files\OpenOffice.org 2.2\share\dict\ooo\;C:\Program files\OpenOffice.org 2.1\share\dict\ooo\;C:\Program files\OpenOffice.org 2.0\share\dict\ooo\
AVAILABLE DICTIONARIES (path is not mandatory for -d option):
LOADED DICTIONARY:
C:\Users\cpbot\OneDrive\Documents\hunspell\\en_GB.aff
C:\Users\cpbot\OneDrive\Documents\hunspell\\en_GB.dic
Hunspell has been compiled without Ncurses user interface.

Note that hunspell prints the loaded dictionary files (with their .aff and .dic extensions), but the AVAILABLE DICTIONARIES section is empty.

Thus, as soon as you try, for example, M-x ispell-region, you are met with the following error message:

ispell-phaf: No matching entry for en_GB in ‘ispell-hunspell-dict-paths-alist’.
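
A quick way to see what the error is complaining about is to look the dictionary up yourself after the failed command, for example with M-: (eval-expression). This is just a diagnostic sketch; "en_GB" is the dictionary name used in this post.

```elisp
(require 'ispell)

;; Returns nil when no entry was registered for the dictionary, which
;; is the condition the ispell-phaf error message reports.
(assoc "en_GB" ispell-hunspell-dict-paths-alist)
```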

Implement work-around

Looking at the following section of the ispell-find-hunspell-dictionaries function, you see (actually: after hours of debugging, you see) that it relies implicitly on the AVAILABLE DICTIONARIES section, which usually lists only the dictionary paths without extensions.

(dolist (dict hunspell-found-dicts)
  (let* ((full-name (file-name-nondirectory dict))
         (basename  (file-name-sans-extension full-name))
         (affix-file (concat dict ".aff")))
    (if (string-match "\\.aff$" dict)
        ;; Found default dictionary
        (progn
          (if hunspell-default-dict
              (setq hunspell-multi-dict
                    (concat (or hunspell-multi-dict
                                (car hunspell-default-dict))
                            "," basename))
            (setq affix-file dict)
            ;; FIXME: The cdr of the list we cons below is never
            ;; used.  Why do we need a list?
            (setq hunspell-default-dict (list basename affix-file)))
          (ispell-print-if-debug
           "++ ispell-fhd: default dict-entry:%s name:%s basename:%s\n"
           dict full-name basename))
      (if (and (not (assoc basename ispell-hunspell-dict-paths-alist))
               (file-exists-p affix-file))
          ;; Entry has an associated .aff file and no previous value.
          (let ((affix-file (expand-file-name affix-file)))
            (ispell-print-if-debug
             "++ ispell-fhd: dict-entry:%s name:%s basename:%s affix-file:%s\n"
             dict full-name basename affix-file)
            (cl-pushnew (list basename affix-file)
                        ispell-hunspell-dict-paths-alist :test #'equal))
        (ispell-print-if-debug
         "-- ispell-fhd: Skipping entry: %s\n" dict)))))

Before I realized the root cause, I hacked the code as follows:

diff -u "c:/Program Files/Emacs/emacs-29.1/share/emacs/29.1/lisp/textmodes/ispell.el~" "c:/Program Files/Emacs/emacs-29.1/share/emacs/29.1/lisp/textmodes/ispell.el"
--- "c:/Program Files/Emacs/emacs-29.1/share/emacs/29.1/lisp/textmodes/ispell.el~"	2023-11-14 09:26:02.059504800 +0200
+++ "c:/Program Files/Emacs/emacs-29.1/share/emacs/29.1/lisp/textmodes/ispell.el"	2023-11-14 09:21:16.309633100 +0200
@@ -1113,7 +1113,7 @@
     (dolist (dict hunspell-found-dicts)
       (let* ((full-name (file-name-nondirectory dict))
             (basename  (file-name-sans-extension full-name))
-	     (affix-file (concat dict ".aff")))
+	     (affix-file (concat (file-name-sans-extension dict) ".aff")))
        (if (string-match "\\.aff$" dict)
            ;; Found default dictionary
            (progn

This change, which should not affect the normal happy code path, is robust to hunspell’s incomplete output in this case, which you should be able to confirm if you trace the code.

In short, when it runs into an .aff file in the combined list (available + loaded), it selects that as the default (loaded) dictionary, which is the old behaviour. However, in the case of a .dic file, it now swaps the extension for .aff (new behaviour) and then falls through to the else branch of the if, exactly like it would have if that basename had been listed under available.
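
The difference between the original and the patched expression is easy to see in isolation. The helper below is hypothetical (it is not part of ispell.el) and simply evaluates both expressions for one entry; the path is modelled on the LOADED DICTIONARY lines shown earlier.

```elisp
;; Hypothetical helper, not part of ispell.el: shows what the original
;; and the patched affix-file expressions produce for one entry.
(defun my/affix-candidates (dict)
  "Return (ORIGINAL PATCHED BASENAME) for the hunspell entry DICT."
  (list
   ;; Original: appends ".aff" to the entry as-is, extension included.
   (concat dict ".aff")
   ;; Patched: strips any extension before appending ".aff".
   (concat (file-name-sans-extension dict) ".aff")
   ;; The basename that ends up as the key in the alist.
   (file-name-sans-extension (file-name-nondirectory dict))))

(my/affix-candidates "C:/Users/cpbot/OneDrive/Documents/hunspell/en_GB.dic")
;; => ("C:/Users/cpbot/OneDrive/Documents/hunspell/en_GB.dic.aff"
;;     "C:/Users/cpbot/OneDrive/Documents/hunspell/en_GB.aff"
;;     "en_GB")
```

For an entry without an extension, which is what the AVAILABLE DICTIONARIES section normally provides, both expressions give the same result, which is why the happy path should be unaffected.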

Conclusion

After all of that, and also realizing that a long-forgotten setting in my init.el was breaking sub-process comms on Windows (this also took more time debugging than I would like to admit):

;; WARNING WARNING WARNING this utterly broke comms with hunspell and aspell on
;; Windows: they would just block forever during a successfully started
;; ispell-region.
;;(setq inhibit-eol-conversion t)

… I was overjoyed at a working Emacs 29 hunspell 1.7 setup on Windows, and then, armed with far more precise search terms, ran into this open hunspell bug.