FIX: Don't blow up when trying to parse invalid or non-ASCII URLs (#9838)

FIX: Don’t blow up when trying to parse invalid or non-ASCII URLs (#9838)

  • FIX: Don’t blow up when trying to parseinvalid or non-ASCII URLs

Follow-up to https://github.com/discourse/discourse/commit/72f139191e9b3aeebeca31ffa8b75977afe5d816

diff --git a/lib/file_store/s3_store.rb b/lib/file_store/s3_store.rb
index 46dece4..f42fa73 100644
--- a/lib/file_store/s3_store.rb
+++ b/lib/file_store/s3_store.rb
@@ -79,7 +79,12 @@ module FileStore
     def has_been_uploaded?(url)
       return false if url.blank?
 
-      parsed_url = URI.parse(url)
+      begin
+        parsed_url = URI.parse(URI.encode(url))
+      rescue URI::InvalidURIError
+        return false
+      end
+
       base_hostname = URI.parse(absolute_base_url).hostname
       if url[base_hostname]
         # if the hostnames match it means the upload is in the same
diff --git a/spec/components/file_store/s3_store_spec.rb b/spec/components/file_store/s3_store_spec.rb
index 81e39d0..15c36b0 100644
--- a/spec/components/file_store/s3_store_spec.rb
+++ b/spec/components/file_store/s3_store_spec.rb
@@ -304,6 +304,15 @@ describe FileStore::S3Store do
 
   describe ".has_been_uploaded?" do
 
+    it "doesn't crash for invalid URLs" do
+      expect(store.has_been_uploaded?("https://site.discourse.com/#bad#6")).to eq(false)
+    end
+
+    it "doesn't crash if URL contains non-ascii characters" do
+      expect(store.has_been_uploaded?("//s3-upload-bucket.s3.dualstack.us-east-1.amazonaws.com/漢1337.png")).to eq(true)
+      expect(store.has_been_uploaded?("//s3-upload-bucket.s3.amazonaws.com/漢1337.png")).to eq(false)
+    end
+
     it "identifies S3 uploads" do
       expect(store.has_been_uploaded?("//s3-upload-bucket.s3.dualstack.us-east-1.amazonaws.com/1337.png")).to eq(true)
     end

GitHub sha: 02f44def

This commit appears in #9838 which was merged by OsamaSayegh.

@martin, could you take a look at this?