Fix/wikipedia oneboxing with url encoded hash (PR #14015)

From the report "Wikipedia oneboxing with url encoded section hash":

Section names containing characters outside the unreserved (URL-safe) set appear percent-encoded in the URL and need to be unescaped before they are used.

In this case Wikipedia generates two different spans on the same page: one whose id results from replacing the % symbols with ., and another whose id is the decoded version of the string. For example, for /wiki/foo#A%C3%A1A it will generate:

<span id="A.C3.A1A"></span>
<span id="AáA">AáA</span>

Unescaping the m_url_hash_name should work in all cases to target the proper section span.
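
As a rough illustration of the idea (not the code in this PR), the sketch below derives both id forms from a URL fragment. The helper name `section_ids` is hypothetical, and the use of `URI.decode_www_form_component` from Ruby's standard library is an assumption made for the example; the actual change may unescape the hash differently.

```ruby
require "uri"

# Hypothetical helper, for illustration only: given a URL, return the two
# span ids Wikipedia would emit for a percent-encoded section fragment.
def section_ids(url)
  # Grab everything after the "#"; nil when the URL has no fragment.
  fragment = url[/#(.+)\z/, 1]
  return nil if fragment.nil?

  {
    # Legacy-style id: each "%" replaced by ".".
    dotted: fragment.tr("%", "."),
    # Decoded id: percent-escapes turned back into UTF-8 characters.
    decoded: URI.decode_www_form_component(fragment)
  }
end

ids = section_ids("/wiki/foo#A%C3%A1A")
puts ids[:dotted]   # A.C3.A1A
puts ids[:decoded]  # AáA
```

Note that `URI.decode_www_form_component` also turns `+` into a space and raises `ArgumentError` on malformed escapes, so a real implementation may want to handle those cases explicitly.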

This PR includes:

  • 1 new test for a URL with a hash section, using the old fixture, a case that was not previously covered
  • 1 new test for a URL with a percent-encoded hash section, with a new fixture based on the reporter’s example
  • A possible fix

First time here… giving Ruby and Rails a try and not entirely sure about the project's conventions (code- and commit-wise), so please let me know if this needs amending in any way!



This pull request has been mentioned on Discourse Meta. There might be relevant details there:

Thanks, this looks good!