FEATURE: add slug generation options (PR #3370)

spec: Removing the concept of "slugs" for some languages - feature - Discourse Meta

You’ve signed the CLA, fantasticfears. Thank you! This pull request is ready for review.

@fantasticfears thanks for your work

@Qasem-h I think I need some time to look at why the spec fails… it may take a few days.

Rebased and fixed all specs.

Very happy to see this change. Can you rebase again, though? Also, screenshots of the affected UI please, plus browser paths, etc.


Only one change to the UI, a site setting:

screenshot 2015-04-28 16 45 23

method: english

Topic

  • name: English title with Chinese sanitized and some special characters!@:?\:'#^& $%&*()`
  • url: http://localhost:3000/t/english-title-with-chinese-sanitized-and-some-special-characters/11
  • slug: english-title-with-chinese-sanitized-and-some-special-characters

screenshot 2015-04-28 16 52 18

Category

  • name: English Category with CJK slug 英文分类
  • input_slug: english slug 英文分类
  • url: http://localhost:3000/c/english-slug
  • slug: english-slug

screenshot 2015-04-28 16 53 26

method: none

Topic

  • name: English title with Chinese sanitized and some special characters!@:?\:'#^& $%&*()`
  • url: http://localhost:3000/t/topic/14
  • slug: topic

Category

  • name: NoneCategory
  • input_slug: none
  • url: http://localhost:3000/c/7-category
  • slug: 7-category

method: encoded

Topic

  • name: English and Chinese title with special characters / 中文标题 !@:?\:'#^& $%&*()`
  • url: http://localhost:3000/t/English-and-Chinese-title-with-special-characters--%E4%B8%AD%E6%96%87%E6%A0%87%E9%A2%98-%5C%60%5E-%25/16
  • slug: English-and-Chinese-title-with-special-characters--%E4%B8%AD%E6%96%87%E6%A0%87%E9%A2%98-%5C%60%5E-%25

screenshot 2015-04-28 17 04 21

Category

parent

  • name: 中文
  • input_slug: (none)
  • url: http://localhost:3000/c/%E4%B8%AD%E6%96%87
  • slug: %E4%B8%AD%E6%96%87

screenshot 2015-04-28 17 06 08

sub

  • name: Second 第二个
  • input_slug: (none)
  • url: http://localhost:3000/c/%E4%B8%AD%E6%96%87/Second-%E7%AC%AC%E4%BA%8C%E4%B8%AA
  • slug: Second-%E7%AC%AC%E4%BA%8C%E4%B8%AA

screenshot 2015-04-28 17 07 49

sub, but parent category is meta

  • name: meta subcategory 中文
  • input_slug: (none)
  • url: http://localhost:3000/c/meta/meta-subcategory-%E4%B8%AD%E6%96%87
  • slug: meta-subcategory-%E4%B8%AD%E6%96%87

screenshot 2015-04-28 17 11 19

Seed data when the default site settings are default_locale="zh_CN" and slug_generation_method="encoded"

The meta category and its category topic:

  • category url: http://localhost:3000/c/%E7%AB%99%E5%8A%A1
  • category topic url: http://localhost:3000/t/%E5%85%B3%E4%BA%8E%E5%88%86%E7%B1%BB%EF%BC%9A%E7%AB%99%E5%8A%A1/2

screenshot 2015-04-28 17 17 16 screenshot 2015-04-28 17 17 21

Would it be possible to rename the English method to Latin, ASCII or something like that? English is somewhat misleading…

@gschlager Would you mind mentioning that on meta again?..

Rebased and updated as @gschlager suggested.

Getting closer.

The implementation has a bunch of fallbacks on the client and server between the various slug methods. Shouldn't it simply switch its logic based on the site setting? That seems much more correct.

Finally figured it out. Unicode in URLs works out of the box; the only thing needed is to sanitize the “meaningful characters” and store the sanitized string as the slug. @SamSaffron Now everything works without a fallback method.
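
As a rough illustration of the approach described above, here is a minimal sketch, not the PR's actual code. The names `CHAR_FILTER`, `encoded_slug` and `slug_for_url` are hypothetical, and the character class is only an assumed approximation of the "meaningful characters" being sanitized. The stored slug keeps its Unicode characters; percent-encoding is applied only when the slug is placed in a URL.

```ruby
require "erb"

# Hypothetical filter: URL-reserved and other "meaningful" characters,
# plus whitespace. The regex used in the PR may differ.
CHAR_FILTER = /[:\/?\#\[\]@!$&'()*+,;=_.~%\\`^\s|{}"<>]+/

# Build the slug that gets stored: Unicode is kept as-is, each filtered
# run becomes a single hyphen, and leading/trailing hyphens are trimmed.
def encoded_slug(title)
  title.gsub(CHAR_FILTER, "-").gsub(/-+/, "-").gsub(/\A-|-\z/, "")
end

# Percent-encode the stored slug only when it is placed in a URL.
def slug_for_url(slug)
  ERB::Util.url_encode(slug)
end

slug = encoded_slug("Second 第二个")
puts slug               # Second-第二个
puts slug_for_url(slug) # Second-%E7%AC%AC%E4%BA%8C%E4%B8%AA
```

Under these assumptions the output lines up with the "encoded" examples in this thread: the stored slug is readable Unicode, and the URL form is its percent-encoded equivalent.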

method: ascii

Topic

  • name: English title with Chinese sanitized and some special characters!@:?\:'#^& $%&*()`
  • url: http://localhost:3000/t/english-title-with-chinese-sanitized-and-some-special-characters/11
  • slug: english-title-with-chinese-sanitized-and-some-special-characters

screenshot 2015-05-04 18 41 24

Category

  • name: English Category with CJK slug 英文分类
  • input_slug: english slug 英文分类
  • url: http://localhost:3000/c/english-slug
  • slug: english-slug

screenshot 2015-05-04 18 41 15

method: none

Topic

  • name: Ignores title text but get default slug 默认 !@:?\:'#^& $%&*()`
  • url: http://localhost:3000/t/topic/13
  • slug: topic

screenshot 2015-05-04 18 43 34

Category

  • name: NoneCategory
  • input_slug: (none)
  • url: http://localhost:3000/c/6-category
  • slug: 6-category

method: encoded

Topic

  • name: English and Chinese title with special characters / 中文标题 !@:?\:'#^& $%&*()`
  • url: http://localhost:3000/t/English-and-Chinese-title-with-special-characters-%E4%B8%AD%E6%96%87%E6%A0%87%E9%A2%98/21
  • slug(stored): English-and-Chinese-title-with-special-characters-中文标题

screenshot 2015-05-04 19 43 11

Category

parent

  • name: 中文
  • input_slug: (none)
  • url: http://localhost:3000/c/%E4%B8%AD%E6%96%87
  • slug(stored): 中文

screenshot 2015-05-04 19 45 03

sub

  • name: Second 第二个
  • input_slug: (none)
  • url: http://localhost:3000/c/%E4%B8%AD%E6%96%87/Second-%E7%AC%AC%E4%BA%8C%E4%B8%AA
  • slug(stored): Second-第二个

sub, but parent category is meta

  • name: meta subcategory 中文
  • input_slug: (none)
  • url: http://localhost:3000/c/meta/meta-subcategory-%E4%B8%AD%E6%96%87
  • slug(stored): meta-subcategory-中文

@fantasticfears please test this : تست فارسی thank you

This is looking good to me @ZogStriP can you review this and merge if it also looks good to you?

There’s no need to pass the 2nd argument since it’s the default value :wink:

  1. Since this site setting is an enum, I don’t think the else case can be reached.
  2. I’d rather fall back to a default “generator” than raise an exception.
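
The two points above could be sketched roughly as follows. This is only an illustration: the module name `SlugSketch`, the generator bodies and the `DEFAULT_METHOD` constant are assumptions, not code from the PR. The idea is to look up the generator by the setting's value and quietly fall back to a default one for an unknown value instead of raising.

```ruby
# Hypothetical sketch; names and generator bodies are illustrative only.
module SlugSketch
  DEFAULT_METHOD = "ascii"

  GENERATORS = {
    # Lowercase, then turn every run of non-alphanumerics into one hyphen.
    "ascii"   => ->(s) { s.downcase.gsub(/[^a-z0-9]+/, "-").gsub(/\A-|-\z/, "") },
    # Keep Unicode; only whitespace is replaced in this simplified version.
    "encoded" => ->(s) { s.strip.gsub(/\s+/, "-") },
    # Ignore the input; the caller would substitute a default such as "topic".
    "none"    => ->(_s) { "" }
  }.freeze

  # Dispatch on the setting value; unknown values use the default generator.
  def self.for(string, method = DEFAULT_METHOD)
    generator = GENERATORS.fetch(method) { GENERATORS[DEFAULT_METHOD] }
    generator.call(string)
  end
end

puts SlugSketch.for("Hello World!")          # hello-world
puts SlugSketch.for("Hello World!", "bogus") # hello-world (default generator)
```

With an enum-backed setting the fallback branch should never run, but it keeps slug generation from failing if an unexpected value ever slips through.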

I tend to prefer writing it like this

def self.ascii_generator(string)
  string.gsub("'", "")
        .parameterize
        .gsub("_", "-")
end

Can’t you use the URI::REGEXP::PATTERN here instead? Or is it different?

The else case and the 2nd parameter in topic.rb are my personal preference… I try to catch impossible things. I’ll remove the redundant 2nd parameter and this else case.

@ZogStriP URI::REGEXP::PATTERN follows RFC 2396. This regex is a superset of it: it sanitizes almost all special characters, which is a convenient way to avoid edge cases.