UTF-8 characters shouldn't be represented as HTML entities in RSS
Describe the bug
Non-ASCII characters (such as é, à) in the name of an account are output as HTML entities in the RSS feed, but they should be output as UTF-8 characters.
Steps to reproduce
- Create a podcast with the title "Les chemins de l'écologie"
- Subscribe to the RSS feed (tested with Thunderbird and Rhythmbox, and by opening the raw XML).
Expected behavior
The title of the stream should be "Les chemins de l'écologie"
Actual behavior
The title of the stream appears with the "é" replaced by an HTML entity instead of the UTF-8 character.
Relevant logs and/or screenshots
In the web interface, the title is output with UTF-8 characters.
But in the RSS stream, it is output with HTML entities:
<rss version="2.0">
<channel>
<atom:link href="https://podcast.picasoft.net/@chemin_ecologie/feed.xml" rel="self" type="application/rss+xml"/>
<lastBuildDate>Tue, 08 Jun 2021 08:31:43 +0000</lastBuildDate>
<generator>Castopod Host - https://castopod.org/</generator>
<docs>https://cyber.harvard.edu/rss/rss.html</docs>
<title>Les chemins de l'écologie</title>
<description>
<p>Cette série de podcasts développe le programme écologique des grands candidats à la présidentielle de 2022 et est à l’initiative des associations UTCéennes Profit’rôles et Convergence ;)</p>
However, the two clients that I have tested do not support HTML entities.
The first one is Thunderbird, a widely used e-mail, newsgroup and RSS client. As you can see, I am subscribed to other feeds containing non-ASCII characters in their title that are displayed correctly.
The second one is Rhythmbox, an audio player distributed with GNOME and present by default on many GNU/Linux distributions.
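A possible explanation for what both clients do: XML predefines only five named entities (`&amp;`, `&lt;`, `&gt;`, `&quot;`, `&apos;`), so an HTML named entity is not defined in an RSS document unless a DTD declares it. The Python sketch below is only an illustration and is not Castopod code. It assumes the feed carries a named entity such as `&eacute;` (the exact entity is an assumption), and `parse_title` is a hypothetical helper.

```python
import xml.etree.ElementTree as ET
from typing import Optional


def parse_title(xml_text: str) -> Optional[str]:
    """Return the element text, or None if the XML parser rejects the input."""
    try:
        return ET.fromstring(xml_text).text
    except ET.ParseError:
        # Raised, among other cases, for entities that XML does not define.
        return None


# HTML named entity: not one of the five entities predefined by XML.
print(parse_title("<title>Les chemins de l'&eacute;cologie</title>"))

# Literal UTF-8 character: accepted as-is.
print(parse_title("<title>Les chemins de l'écologie</title>"))

# Numeric character reference: also valid XML, decoded by the parser.
print(parse_title("<title>Les chemins de l'&#233;cologie</title>"))
```

A strict parser rejects the first form outright, while lenient feed readers may instead show the entity text verbatim, which would match what I see in the title.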
Context
- Castopod:
  - back: castopod host v2.0.0-alpha.57
  - front: castopod host v1.0.0-alpha.57 (for static content)
- OS, server:
  - back: linux + docker php:7.4-fpm-alpine3.13
  - front: linux + docker nginx:1.19-alpine
- Browser: Firefox, Thunderbird, Rhythmbox
Possible fixes
Do not use HTML entities in the RSS feed. I couldn't find the view responsible for generating the RSS, but it seems that the function htmlspecialchars is applied to the description of the episodes, and non-ASCII characters in episode descriptions are displayed correctly in my feed readers. Applying the same escaping to the title should therefore be enough.
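To make the suggestion concrete, here is a minimal Python sketch of the intended behaviour: escape only the characters that are special in XML and leave everything else as UTF-8. It is an analogue of what htmlspecialchars does for the descriptions, not the actual Castopod code, and `rss_title_element` is a hypothetical name.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape


def rss_title_element(title: str) -> str:
    """Build a <title> element, escaping only &, < and >."""
    # escape() leaves non-ASCII characters such as "é" untouched.
    return f"<title>{escape(title)}</title>"


element = rss_title_element("Les chemins de l'écologie & co")
print(element)

# Round trip: an XML parser gives back the original title.
print(ET.fromstring(element).text)
```

With this approach the accented characters reach the feed reader unchanged, and only `&`, `<` and `>` are replaced by entities that every XML parser understands.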