HUGO

  • News
  • Docs
  • Themes
  • Community
  • GitHub
Star

What's on this Page

    • Trouble: Categories with accented characters
    • Solution
      • Discussion Forum References
TROUBLESHOOTING

Accented Characters in URLs

If you’re having trouble with special characters in your taxonomies or titles adding odd characters to your URLs.

Trouble: Categories with accented characters

One of my categories is named “Le-carré,” but the link ends up being generated like this:

categories/le-carr%C3%A9

And not working. Is there an easy fix for this that I’m overlooking?

Solution

Are you a macOS user? If so, you are likely a victim of HFS Plus file system’s insistence to store the “é” (U+00E9) character in Normal Form Decomposed (NFD) mode, i.e. as “e” + “ ́” (U+0065 U+0301).

le-carr%C3%A9 is actually correct, %C3%A9 being the UTF-8 version of U+00E9 as expected by the web server. The problem is that OS X turns [U+00E9] into [U+0065 U+0301], and thus le-carr%C3%A9 no longer works. Instead, only le-carre%CC%81 ending with e%CC%81 would match that [U+0065 U+0301] at the end.

This is unique to OS X. The rest of the world does not do this, and most certainly not your web server which is most likely running Linux. This is not a Hugo-specific problem either. Other people have been bitten by this when they have accented characters in their HTML files.

Note that this problem is not specific to Latin scripts. Japanese Mac users often run into the same issue; e.g., with だ decomposing into た and ゙. (Read the Japanese Perl users article).

Rsync 3.x to the rescue! From an answer posted on Server Fault:

You can use rsync’s --iconv option to convert between UTF-8 NFC & NFD, at least if you’re on a Mac. There is a special utf-8-mac character set that stands for UTF-8 NFD. So to copy files from your Mac to your web server, you’d need to run something like:

rsync -a --iconv=utf-8-mac,utf-8 localdir/ mywebserver:remotedir/

This will convert all the local filenames from UTF-8 NFD to UTF-8 NFC on the remote server. The files’ contents won’t be affected. - Server Fault

Please make sure you have the latest version of rsync 3.x installed. The rsync that ships with OS X is outdated. Even the version that comes packaged with 10.10 (Yosemite) is version 2.6.9 protocol version 29. The --iconv flag is new in rsync 3.x.

Discussion Forum References

  • http://discourse.gohugo.io/t/categories-with-accented-characters/505
  • http://wiki.apache.org/subversion/NonNormalizingUnicodeCompositionAwareness
  • https://en.wikipedia.org/wiki/Unicode_equivalence#Example
  • http://zaiste.net/2012/07/brand_new_rsync_for_osx/
  • https://gogo244.wordpress.com/2014/09/17/drived-me-crazy-convert-utf-8-mac-to-utf-8/

See Also

  • absLangURL
  • relLangURL
  • uniq
  • urls.Parse
  • Links and Cross References
  • About Hugo
    • Overview
    • Hugo Features
    • The Benefits of Static
    • Roadmap
    • License
  • Getting Started
    • Get Started Overview
    • Quick Start
    • Install Hugo
    • Basic Usage
    • Directory Structure
    • Configuration
  • Themes
    • Themes Overview
    • Install and Use Themes
    • Customize a Theme
    • Create a Theme
  • Content Management
    • Content Management Overview
    • Organization
    • Supported Content Formats
    • Front Matter
    • Shortcodes
    • Related Content
    • Sections
    • Types
    • Archetypes
    • Taxonomies
    • Summaries
    • Links and Cross References
    • URL Management
    • Menus
    • Table of Contents
    • Comments
    • Multilingual and i18n
    • Syntax Highlighting
  • Templates
    • Templates Overview
    • Introduction
    • Template Lookup Order
    • Custom Output Formats
    • Base Templates and Blocks
    • List Page Templates
    • Homepage Template
    • Section Templates
    • Taxonomy Templates
    • Single Page Templates
    • Content View Templates
    • Data Templates
    • Partial Templates
    • Shortcode Templates
    • Local File Templates
    • 404 Page
    • Menu Templates
    • Pagination
    • RSS Templates
    • Sitemap Template
    • Robots.txt
    • Internal Templates
    • Alternative Templating
    • Template Debugging
  • Functions
    • Functions Quick Reference
    • .AddDate
    • .Format
    • .Get
    • .GetPage
    • .Param
    • .Scratch
    • .Unix
    • Math
    • absLangURL
    • absURL
    • after
    • apply
    • base64
    • chomp
    • countrunes
    • countwords
    • dateFormat
    • default
    • delimit
    • dict
    • echoParam
    • emojify
    • eq
    • findRE
    • first
    • ge
    • getenv
    • gt
    • hasPrefix
    • highlight
    • htmlEscape
    • htmlUnescape
    • humanize
    • i18n
    • imageConfig
    • in
    • index
    • int
    • intersect
    • isset
    • jsonify
    • lang.NumFmt
    • last
    • le
    • lower
    • lt
    • markdownify
    • md5
    • ne
    • now
    • partialCached
    • plainify
    • pluralize
    • print
    • printf
    • println
    • querify
    • range
    • readDir
    • readFile
    • ref
    • relLangURL
    • relURL
    • relref
    • render
    • replace
    • replaceRE
    • safeCSS
    • safeHTML
    • safeHTMLAttr
    • safeJS
    • safeURL
    • seq
    • sha
    • shuffle
    • singularize
    • slice
    • slicestr
    • sort
    • split
    • string
    • strings.TrimLeft
    • strings.TrimPrefix
    • strings.TrimRight
    • strings.TrimSuffix
    • substr
    • time
    • title
    • trim
    • truncate
    • union
    • uniq
    • upper
    • urlize
    • urls.Parse
    • where
    • with
  • Variables
    • Variables Overview
    • Site Variables
    • Page Variables
    • Shortcode Variables
    • Taxonomy Variables
    • File Variables
    • Menu Variables
    • Hugo Variables
    • Git Variables
    • Sitemap Variables
  • CLI
  • Troubleshooting
    • Troubleshoot
    • Accented Characters in URLs
    • Build Performance
    • EOF Error
  • Tools
    • Developer Tools Overview
    • Migrations
    • Starter Kits
    • Frontends
    • Editor Plug-ins
    • Search
    • Other Projects
  • Hosting & Deployment
    • Hosting & Deployment Overview
    • Host-Agnostic Deploys with Nanobox
    • Host on Netlify
    • Host on Firebase
    • Host on GitHub
    • Host on GitLab
    • Host on Bitbucket
    • Deployment with Wercker
    • Deployment with Rsync
  • Contribute
    • Contribute to Hugo
    • Development
    • Documentation
    • Themes
“Accented Characters in URLs” was last updated: October 13, 2017: Initial commit (a48229f)
Improve this page
By the Hugo Authors

The Hugo logos are copyright © Steve Francia 2013–2017.

The Hugo Gopher is based on an original work by Renée French.

  • File an Issue
  • Get Help
  • Discuss the Source Code
  • @GoHugoIO
  • @spf13
  • @bepsays
  • News
  • Docs
  • Themes
  • Community
  • GitHub
  • About Hugo
  • Getting Started
  • Themes
  • Content Management
  • Templates
  • Functions
  • Variables
  • CLI
  • Troubleshooting
  • Tools
  • Hosting & Deployment
  • Contribute