FIX: Unescapes hash section when present to account for url-encoded chars

Section names containing characters outside the unreserved set will appear url-encoded in the hash and need to be unescaped before being used.

In this case Wikipedia generates two different spans in the same page: one whose id is the result of replacing the % symbols with . and another whose id is the decoded version of the string. For example, for /wiki/foo#A%C3%A1A the ids are A.C3.A1A and AáA, and the section heading renders as:

AáA

Unescaping the m_url_hash_name should work in all cases to target the proper section span.
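
As a rough illustration of the two ids described above, here is a minimal sketch using only Ruby's standard `cgi` library. The helper names `decoded_section_id` and `dotted_section_id` are hypothetical and do not exist in the onebox source; the diff below only calls `CGI.unescape` inline.

```ruby
require "cgi"

# Hypothetical helper: the decoded form of the hash, which is what the
# patched XPath lookup uses as the span id.
def decoded_section_id(hash_name)
  CGI.unescape(hash_name)
end

# Hypothetical helper: the other id form described above, where each "%"
# is replaced with ".".
def dotted_section_id(hash_name)
  hash_name.tr("%", ".")
end

fragment = "A%C3%A1A" # hash part of /wiki/foo#A%C3%A1A
puts decoded_section_id(fragment)
puts dotted_section_id(fragment)
```

A hash that contains no percent-escapes or `+` characters passes through `CGI.unescape` unchanged, so plain ASCII section names should keep matching as before.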

diff --git a/lib/onebox/engine/wikipedia_onebox.rb b/lib/onebox/engine/wikipedia_onebox.rb
index e86a901..6d7d404 100644
--- a/lib/onebox/engine/wikipedia_onebox.rb
+++ b/lib/onebox/engine/wikipedia_onebox.rb
@@ -24,7 +24,7 @@ module Onebox
         end
 
         unless m_url_hash.nil?
-          section_header_title = raw.xpath("//span[@id='#{m_url_hash_name}']")
+          section_header_title = raw.xpath("//span[@id='#{CGI.unescape(m_url_hash_name)}']")
 
           if section_header_title.empty?
             paras = raw.search("p") # default get all the paras

GitHub sha: d27d7c8ccac49bd99ad54a73c8f16fe3e74b02a9

This commit appears in #14015 which was approved by eviltrout. It was merged by eviltrout.