Commit
Make UTF-8 the default encoding for XML feeds
Consider the feed http://planet.haskell.org/atom.xml

- This is a UTF-8 encoded XML file
- No encoding declaration in the XML header
- No Unicode byte order mark
- Served with HTTP Content-Type "text/xml" (no charset parameter)

Miniflux lets charset.NewReader handle this. The charset package
implements the HTML5 character encoding algorithm, which, in this
situation, defaults to windows-1252 encoding if there are no non-ASCII
UTF-8 characters in the first 1000 bytes. So for this feed, we get the
wrong encoding.

I inserted an explicit "utf8.Valid()" check, which fixes this problem.