The following discussion is an archived debate. Please do not modify it. To request review of this BRFA, please start a new section at Wikipedia:Bots/Noticeboard. The result of the discussion was Approved.

Operator: Kanashimi

Time filed: 10:03, Wednesday, April 14, 2021 (UTC)

Function overview: Sorting category of Thai names

Automatic, Supervised, or Manual: Automatic

Programming language(s): JavaScript (wikiapi on GitHub)

Source code available: 20210416.Sorting category and sort key of Thai names.js on GitHub, 20210422.Sorting category and sort key of Thai names.js on GitHub

Links to relevant discussions (where appropriate): Wikipedia:Bot_requests#Category sorting for Thai names

Edit period(s): weekly

Estimated number of pages affected: hundreds up to 5,500

Namespace(s): Articles

Exclusion compliant (Yes/No): Yes

Function details: Please refer to Wikipedia:Bot_requests#Category sorting for Thai names. --Kanashimi (talk) 10:03, 14 April 2021 (UTC)[reply]

Discussion

Operations so far:
  1.  Done Insert ((Thai people category)) into the categories listed in Wikipedia:WikiProject Thailand/Thai name categories
  2.  Doing... Fix Wikipedia:WikiProject Thailand/Thai name sort keys... @Paul 012: There seem to be some bugs(?) in the list; I have just fixed them. Are the modifications I made right? By the way, does "Same" mean "the article title is the same as the default sort"? And what should we do about redirects? (e.g., Namkabuan Nongkee Pahuyuth → Namkabuan Nongkeepahuyuth, Samuenthep Por Petchsiri → Parnthep V.K.Khaoyai, and 8 other pages now) --Kanashimi (talk) 03:14, 17 April 2021 (UTC)[reply]
    Kanashimi, the intention was for the Same column to indicate pages which should be tagged with ((Thai sort same as defaultsort)). They are the rows with empty Thai sort values. (I don't think we need to consider whether the article title is the same as the Defaultsort; even if they are, I think a manual Defaultsort value should still be given, to prevent later mistaken automatic additions.) As for the redirects, which are the result of recent page moves, I've now updated the corresponding entries in the table. --Paul_012 (talk) 05:21, 17 April 2021 (UTC)[reply]
Looks like I got the meaning of the word "Same" wrong. I have changed the title. --Kanashimi (talk) 06:19, 17 April 2021 (UTC)[reply]
@Paul 012: I find that some pages already have sort keys. Do we need to overwrite them? E.g.,
  1. Krissada Sukosol Clapp: The default sort key of ((DEFAULTSORT:Clapp, Krissada Sukosol)) will be set to "Sukosol Clapp, Krissada"!
  2. Krissada Sukosol Clapp: The sort key of Krissada Terrence will be set to "Krissada Sukosol Clapp"!
  3. Krissada Sukosol Clapp: The sort key of Krissada Terrence will be set to "Krissada Sukosol Clapp"!
  4. Nicolene Pichapa Limsnukan: The default sort key of ((DEFAULTSORT:Limsnukan, Pichapa)) will be set to "Limsnukan, Nicolene Pichapa"!
  5. Vachirawit Chivaaree: The default sort key of ((DEFAULTSORT:Chiva-aree, Vachirawit)) will be set to "Chivaaree, Vachirawit"!
I also find many pages that are not listed in Wikipedia:WikiProject Thailand/Thai name sort keys. Do I need to list them on a page? Is there a good page name for this? For these pages, do we just need to modify the sort key of the categories in Thai_CATEGORY_LIST, as [[Category:Category name|PAGENAME]]? --Kanashimi (talk) 04:40, 18 April 2021 (UTC)[reply]
@Kanashimi: Yes, the five examples above are all cases where the existing sort keys/defaultsort should be overwritten. As for pages that are not listed, I identified 1,227 non-redirect items via this Petscan query; do they match your list? Most of them should be either non-Thai names, single-word names, or non-biographies, so yes, only PAGENAME sort keys should be added to the Thai categories, without touching the defaultsort values. (There are a few recently created articles, which I think will be easier to manually deal with.) I'll go through some examples to make sure we're on the same page.
  • 2007 Thai House of Representatives is a non-biography page, so it should be ignored (though Category:House of Representatives of Thailand shouldn't have been on the list; I've removed the Thai name category template now). So are Abhisit cabinet, Australians in Thailand, etc., and all the List of... articles. Armchair (band) is about a band, not a person, so articles like these should also be regarded as non-biographies. (There were a few biographical articles that were missing birth/death year categories; I've gone through and added them, so all articles without such categories are now non-biographical.)
  • Abbas Sarkhab is a non-Thai name, so the only change should be to add the sort key [[Category:Police Tero F.C. players|Abbas Sarkhab]]. Similarly, Abdoul Karim Sylla (footballer, born 1981) should have [[Category:Uttaradit F.C. players|Abdoul Karim Sylla]] (without the parenthetical disambiguator).
  • Abbhantripaja is a single word, so no sort key is necessary, and it should be skipped. I forgot to account for cases like this in the original discussion; can this be added to the logic?
  • Similarly, Aguinaldo (footballer) is one word (excluding the disambiguator) and should be likewise skipped.
  • Abu Samah Mohd Kassim currently has no defaultsort value. Alef Vieira Santos currently has ((DEFAULTSORT:Alef Vieira Santos)), which is identical to the page name. Abbas II of Egypt has ((DEFAULTSORT:Abbas II Of Egypt)), which differs only in capitalisation. Adding PAGENAME as sort keys will result in no change, so all of these cases should also be skipped.
There are a few complicated cases, mostly involving foreign royalty. Examples include Frederick IX of Denmark (((DEFAULTSORT:Frederick 09 Of Denmark))) and Prince Gustav of Denmark (((DEFAULTSORT:Gustav Of Denmark, Prince))). These should all be skipped, though I'm not sure if there's a common pattern that could identify them. Maybe they'll have to be manually listed?
Also, I'm thinking maybe ((Thai sort key not needed)) might be a better, more self-explanatory name for the ((Thai sort same as defaultsort)) template. --Paul_012 (talk) 14:28, 18 April 2021 (UTC)[reply]
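Illustrative sketch (not taken from the bot's source): the handling of unlisted pages described above, where the sort key is the page name without its parenthetical disambiguator and single-word names are skipped, might look roughly like this. The function names `stripDisambiguator` and `pagenameSortKeyLink` are hypothetical.

```javascript
// Hypothetical helpers for illustration only; not part of the bot's code.

// Remove a trailing parenthetical disambiguator:
// "Aguinaldo (footballer)" -> "Aguinaldo".
function stripDisambiguator(title) {
  return title.replace(/\s+\([^()]*\)$/, '');
}

// Build a category link that uses the page name as its sort key.
// Returns null when the name is a single word, since no key is needed.
function pagenameSortKeyLink(category, title) {
  const name = stripDisambiguator(title);
  if (!/\s/.test(name)) return null; // single-word name: skip the page
  return `[[Category:${category}|${name}]]`;
}

console.log(pagenameSortKeyLink(
  'Uttaradit F.C. players',
  'Abdoul Karim Sylla (footballer, born 1981)'
)); // [[Category:Uttaradit F.C. players|Abdoul Karim Sylla]]
console.log(pagenameSortKeyLink('Police Tero F.C. players', 'Abbhantripaja')); // null
```

The sketch only covers the string handling; the DEFAULTSORT comparison and the foreign-royalty exceptions mentioned above are not modelled here.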
@Paul 012: Well, I have started doing some test edits (please search for "Maintain sort key of Thai peoples"). Please tell me what you think of the results. --Kanashimi (talk) 06:16, 19 April 2021 (UTC)[reply]
The bot worked as expected in almost all cases, but it seems to still have problems with cases where the value in the Thai sort = Default sort column is yes. For Phra Chenduriyang (diff), the desired result would have been to:
  1. Update the defaultsort value to ((DEFAULTSORT:Chenduriyang, Phra)) (no change since this matches the value already in the article)
  2. Add ((Thai sort key not needed)) above the defaultsort line
  3. Skip to the next article without adding sort keys to individual categories.
I noticed a few cases where categories that should have had ((Thai people category)) added were missed; that was an oversight on my part, and I've manually added the template. I've also added foreign royals to the top of Wikipedia:WikiProject Thailand/Thai name sort keys. Please note that they will need to be treated differently from other items, as no change should be made to their defaultsort values. --Paul_012 (talk) 09:47, 19 April 2021 (UTC)[reply]
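Illustrative sketch (an assumption about how the edit could be made, not the bot's actual code): step 2 above, placing the marker template on its own line directly above the existing DEFAULTSORT line, can be expressed as a small wikitext transformation. The templates are written with MediaWiki's double-brace syntax; the function name `addMarkerAboveDefaultsort` and the category in the example are hypothetical.

```javascript
// Hypothetical helper for illustration only; not part of the bot's code.
const MARKER = '{{Thai sort key not needed}}';

// Insert the marker template on the line above the first DEFAULTSORT.
// Pages that already carry the marker, or that have no DEFAULTSORT at
// the start of a line, are returned unchanged.
function addMarkerAboveDefaultsort(wikitext) {
  if (wikitext.includes(MARKER)) return wikitext; // already tagged
  // The 'm' flag lets ^ match at the start of any line.
  const pattern = /^\{\{DEFAULTSORT:[^{}]*\}\}/m;
  if (!pattern.test(wikitext)) return wikitext; // nothing to anchor on
  return wikitext.replace(pattern, (line) => `${MARKER}\n${line}`);
}

const before = [
  'Article text.',
  '{{DEFAULTSORT:Chenduriyang, Phra}}',
  '[[Category:Example category]]', // hypothetical category name
].join('\n');
console.log(addMarkerAboveDefaultsort(before));
```

Because the function returns tagged pages unchanged, running it twice gives the same result as running it once, which matters for a task that revisits the same pages.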
@Paul 012: I fixed the mistake with ((Thai sort key not needed)) and did some more test edits (please search for "Maintain sort key of Thai peoples", after "10:15, 19 April 2021 diff hist +60 m Tamarine Tanasugarn Maintain sort key of Thai peoples:"). Could you take a look at them? Thank you. --Kanashimi (talk) 10:29, 19 April 2021 (UTC)[reply]
Thanks. The addition of ((Thai sort key not needed)) works properly now. The only errors here are non-biographical pages getting included in the edits: Royal Standard of Thailand (diff), Supreme Council of State of Siam (diff), Royal flags of Thailand (diff), The King Never Smiles (diff), Chakri dynasty (diff), and Australians in Thailand (diff). Have you implemented the check yet? --Paul_012 (talk) 10:52, 19 April 2021 (UTC)[reply]
@Paul 012: Sorry. I have just implemented the function. I have done some more edits (from "13:45, 19 April 2021 diff hist +29 m Vapi Busbakara"). Do they seem OK? By the way, do I need to create a page to log all the pages I detect as non-biographies, so that nothing is missed? --Kanashimi (talk) 13:59, 19 April 2021 (UTC)[reply]
The new results are all good. Having a log would be useful, though would probably only be needed for the initial run (and not the subsequent weekly(?) updates). I would like to see some edits confirming that the bot respects the "don't change" values in the table; could you have it run through Category:Knights Grand Cordon of the Order of Chula Chom Klao? Also, could you demonstrate the behaviour for the subsequent weekly(?) runs, where it does not refer to the table and does not change defaultsort values? (Category:Thai people of American descent could be used for this, since the bot has already gone through its contents once, and there are some changes that should be made since I just tagged a few missing categories.) --Paul_012 (talk) 18:08, 19 April 2021 (UTC)[reply]
@Paul 012: I have just run through Category:Knights Grand Cordon of the Order of Chula Chom Klao, Category:Thai people of American descent and Category:Thai female Phra Ong Chao. Please take a look. In the current implementation, the bot refers to Wikipedia:WikiProject Thailand/Thai name sort keys on every run, so the table MUST be kept up to date. --Kanashimi (talk) 21:44, 19 April 2021 (UTC)[reply]
The don't change instructions aren't generating the desired behaviour. See Frederick IX of Denmark (diff), Princess Irene of the Netherlands (diff), etc. Since the entries have Thai sort = Default sort set to yes, ((Thai sort key not needed)) should have been placed and no category sort keys added. (Alternatively, it'd be okay to skip these pages altogether; I can manually add the template later.)
As for the current implementation depending on the table, that's fine for the initial run, but it would be impractical to keep the table updated into the future. If it can't be tested now, I'm not sure if a second BRFA would be needed for the updated future task. (The major functional differences would be skipping the defaultsort change and finding ((Thai sort key not needed)) on a page instead of looking up the Thai sort = Default sort column in the table.) --Paul_012 (talk) 03:57, 20 April 2021 (UTC)[reply]
I fixed the code and it seems to work now. --Kanashimi (talk) 06:00, 20 April 2021 (UTC)[reply]

the long term task


@Paul 012: It seems that this is what we want to do in the future:

  1.  Done Get all pages of Thai_name_categories (categories transcluding ((Thai people category))).
  2. If the page transcludes ((Thai sort key not needed)):
    1. Remove all sort keys of Thai_name_categories -- This is possible for the bot.
  3. else (the page does not transclude ((Thai sort key not needed))):
    1. Here comes the problem: How does the bot determine the DEFAULTSORT and the sort key of Thai_name_categories without Wikipedia:WikiProject Thailand/Thai name sort keys for a newly created page? --Kanashimi (talk) 06:01, 20 April 2021 (UTC)[reply]
In the future, I expect that the bot should always leave DEFAULTSORT values unchanged. (For new articles, the correct values should be added manually when creating the page.) For 3.1, pages which do not transclude ((Thai sort key not needed)), the bot should follow the same behaviour as currently done for pages not in the table, e.g. Amanda Mildred Carr (diff). I.e., the sort key for Thai_name_categories should be the PAGENAME, excluding (parenthetical disambiguators). For 2.1, I don't think we want the bot to remove sort keys; there may be cases where manually added keys are still needed in some categories. As with the current set-up, pages with single-word titles should also be skipped, as well as pages with empty DEFAULTSORTs or DEFAULTSORTs that are identical to the PAGENAME or differ only in capitalisation (e.g. Abu Samah Mohd Kassim and Alef Vieira Santos above). I'm not sure if this last check has already been implemented? (There are very few pages that will be affected, so it's hard to tell from the trial so far.) --Paul_012 (talk) 18:05, 20 April 2021 (UTC)[reply]
Since the operating mechanism will change, this part should be implemented after the initial run. @TheSandDoctor: May I start the initial run? There seem to be no problems now, so we can go further. Kanashimi (talk) 21:47, 20 April 2021 (UTC)[reply]
@Kanashimi: Were any edits made? --TheSandDoctor Talk 22:21, 20 April 2021 (UTC)[reply]
The initial run will work on most pages in categories transcluding ((Thai people category)). --Kanashimi (talk) 23:19, 20 April 2021 (UTC)[reply]
I think TheSandDoctor was asking about the trial run. Here are the filtered contributions: 1, 2, 3, 4, 5 --Paul_012 (talk) 03:18, 21 April 2021 (UTC)[reply]
@Kanashimi: Paul_012 was correct. I just wanted to confirm whether it was an extended trial being asked for or if you could just continue the one you are on. How many edits would you like to see for the extended? I can set it to whatever works best for you. --TheSandDoctor Talk 04:09, 21 April 2021 (UTC)[reply]
I want to continue the task I am doing now. The code is listed here: 20210416.Sorting category and sort key of Thai names.js on GitHub. The code has, as you can see above, been tested. It will affect thousands of pages. After the initial run, I will try the long-term task mentioned above. Kanashimi (talk) 04:47, 21 April 2021 (UTC)[reply]
To clarify, the task is composed of two parts: (1) a one-time initial run (affecting up to some 5,000+ articles by my count (I've modified the request figure above), though likely less since pages which are already correctly formatted should be unaffected), and (2) a repeating long-term task (covering the same articles, but only as updates are needed following page category changes). From the trials thus far, the bot seems to be almost ready for the initial run, but a further trial period may still be needed after that to adjust the bot for the long-term task.
Before that is considered, I would still like to see another test, though. Kanashimi, could you perform another test to confirm that the following articles are skipped by the bot?
These are all members of Category:Chiangrai United F.C. players. --Paul_012 (talk) 08:38, 21 April 2021 (UTC)[reply]
 Done Well, maybe we can remove the DEFAULTSORT of these articles and insert ((Thai sort key not needed))? Kanashimi (talk) 09:49, 21 April 2021 (UTC)[reply]

Hmm. So the check isn't currently implemented? Would it be complicated to do so? In any case, the DEFAULTSORTs should not be removed (Brazilian names seem to have a separate set of rules which I'm not quite familiar with). Using ((Thai sort key not needed)) (and marking the DEFAULTSORT as don't change, similarly to the foreign royals) could technically work, but the problem is that I have no idea which articles would need to be marked as such, and can't list them without manually checking the DEFAULTSORT for each and every one of them. It would be much simpler if the bot could automatically do the check and skip them altogether. On the other hand, if it's technically costly, I guess these edits and the need to skip them could be ignored, as they aren't causing any harm (apart from being unnecessary edits, but I don't expect that there are more than a few dozen cases.) --Paul_012 (talk) 11:32, 21 April 2021 (UTC)[reply]

I need a clear rule for this kind of article. Is this right? For all Thai people pages not listed in Wikipedia:WikiProject Thailand/Thai name sort keys,
  1. If the DEFAULTSORT is the same as the page title (excluding the disambiguator)
    1. then do the same thing as don't change
  2. else (the DEFAULTSORT is NOT the same as the page title excluding the disambiguator)
    1. Modify the sort key of Thai categories to the page name. Kanashimi (talk) 11:37, 21 April 2021 (UTC)[reply]
If this rule is already being checked, it shouldn't be necessary to add ((Thai sort key not needed)). I think it'd be better to skip the article entirely (the same treatment as articles with single-word titles). Also, I forgot about diacritics, which should be removed for comparison (and for the sort keys). So the rule would be: For all Thai people pages not listed in Wikipedia:WikiProject Thailand/Thai name sort keys,
  1. If (the page title is a single word OR a single word plus a disambiguator) OR (the page has no DEFAULTSORT) OR (the DEFAULTSORT, with diacritics removed, is the same (case-insensitive) as the page title (excluding the disambiguator), with diacritics removed)
    1. then skip the page
  2. else (the DEFAULTSORT is NOT the same as the page title excluding the disambiguator)
    1. Modify sort key of Thai categories, as page name (excluding the disambiguator), with diacritics removed.
--Paul_012 (talk) 12:08, 21 April 2021 (UTC)[reply]
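Illustrative sketch (a reading of the rule above, not the bot's actual code): the skip test for unlisted pages compares the DEFAULTSORT with the page title after removing the disambiguator, diacritics and case differences. Diacritics are removed here by Unicode decomposition (NFD) followed by dropping combining marks, which is an assumption; letters that do not decompose, such as "ø" or "đ", are left as they are. The function names and the DEFAULTSORT values marked as hypothetical are not from the discussion.

```javascript
// Hypothetical helpers for illustration only; not part of the bot's code.

// "Name (disambiguator)" -> "Name"
function stripDisambiguator(title) {
  return title.replace(/\s+\([^()]*\)$/, '');
}

// Decompose, drop combining marks, recompose: "José" -> "Jose".
function removeDiacritics(text) {
  return text.normalize('NFD').replace(/\p{M}/gu, '').normalize('NFC');
}

// Case-insensitive, diacritic-insensitive form used only for comparison.
function fold(text) {
  return removeDiacritics(text).toLowerCase().trim();
}

// Returns null when the page should be skipped, otherwise the sort key
// to use in the Thai-name categories.
function sortKeyForUnlistedPage(title, defaultsort) {
  const name = stripDisambiguator(title);
  if (!/\s/.test(name)) return null;                 // single-word title
  if (!defaultsort) return null;                     // no DEFAULTSORT
  if (fold(defaultsort) === fold(name)) return null; // effectively identical
  return removeDiacritics(name);
}

console.log(sortKeyForUnlistedPage('Abbas II of Egypt', 'Abbas II Of Egypt')); // null
console.log(sortKeyForUnlistedPage('Abu Samah Mohd Kassim', null));            // null
// Hypothetical DEFAULTSORT value, for illustration:
console.log(sortKeyForUnlistedPage('Abbas Sarkhab', 'Sarkhab, Abbas'));        // Abbas Sarkhab
```

Returning null for "skip" keeps the three skip conditions in one place, so a caller only has to check whether a key came back.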
 Fixed Are there other concerns or pages to test? Kanashimi (talk) 12:45, 21 April 2021 (UTC)[reply]
Thanks. Could you test the above pages again, to confirm that the fix works as intended? And also, as a final matter, a test-run that covers some of the following pages (all in Category:Thai male boxers) would be a good idea, since so far no multi-word surnames seem to have been included in the trial run yet.
--Paul_012 (talk) 13:23, 21 April 2021 (UTC)[reply]
 Done 6 pages skipped. Kanashimi (talk) 13:33, 21 April 2021 (UTC)[reply]
Seems good now; thanks a lot. TheSandDoctor, I believe the identified bugs have been addressed and don't expect problems going through with the entire initial run (phase 1). If approved, maybe the BRFA could be kept open so that discussion of phase 2 can continue later? --Paul_012 (talk) 13:47, 21 April 2021 (UTC)[reply]
I set the summary to "Maintain sort key of Thai peoples: Modify sort key of Thai categories, as page name." Please help me to change to a better summary, thank you. Kanashimi (talk) 01:15, 22 April 2021 (UTC)[reply]
I think, after the initial run, there will be few pages left. Maybe I can do some test edits for phase 2? I have done one test edit at Kiatprawut Saiwaeo. Kanashimi (talk) 02:22, 22 April 2021 (UTC)[reply]
I don't have a problem with the current edit summary, but to be a bit more precise I guess it could say "Maintaining sort keys in Thai-people categories: Per values in Wikipedia:WikiProject Thailand/Thai name sort keys" and "Maintaining sort keys in Thai-people categories: Adding page title as sort keys". --Paul_012 (talk) 18:46, 22 April 2021 (UTC)[reply]

To reiterate, I think the workflow for phase 2 should be something like this:

  1. Get all pages of Thai_name_categories (categories transcluding ((Thai people category))).
  2. If (the page is not a biography) OR (the page transcludes ((Thai sort key not needed))) OR (the page title is a single word OR a single word plus a disambiguator) OR (the page has no DEFAULTSORT) OR (the DEFAULTSORT, with diacritics removed, is the same (case-insensitive) as the page title (excluding the disambiguator), with diacritics removed)
    1. then skip the page
  3. else
    1. Modify sort key of Thai categories, as page name (excluding the disambiguator), with diacritics removed.

I realise this makes ((Thai sort key not needed)) kind of redundant for articles like Headache Stencil, Abha Barni and Aed Carabao, where the DEFAULTSORT should match the title, but I don't think that should be a problem. --Paul_012 (talk) 18:46, 22 April 2021 (UTC)[reply]
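Illustrative sketch (one possible reading of the phase 2 workflow above, not the bot's actual code): each page is reduced to a small record and a single function decides whether to skip it or which sort key to set. The record shape (`title`, `categories`, `templates`, `defaultsort`), the function names, and the biography test are assumptions; the biography test relies on the earlier statement that every biographical article carries a birth or death year category, and its pattern is a guess at how such categories are named.

```javascript
// Hypothetical helpers for illustration only; not part of the bot's code.
const SKIP_TEMPLATE = 'Thai sort key not needed';

// "Name (disambiguator)" -> "Name"
const baseName = (title) => title.replace(/\s+\([^()]*\)$/, '');

// Drop combining marks after decomposition: "José" -> "Jose".
const plain = (text) =>
  text.normalize('NFD').replace(/\p{M}/gu, '').normalize('NFC');

// Assumption: a biography has a category such as "1990 births".
function isBiography(page) {
  return page.categories.some((c) => /^\d{1,4}s? (births|deaths)$/.test(c));
}

// Decide what phase 2 should do with one page.
function decidePhase2(page) {
  const name = baseName(page.title);
  const skip = (reason) => ({ action: 'skip', reason });
  if (!isBiography(page)) return skip('not a biography');
  if (page.templates.includes(SKIP_TEMPLATE)) return skip('marked as not needing a key');
  if (!/\s/.test(name)) return skip('single-word title');
  if (!page.defaultsort) return skip('no DEFAULTSORT');
  if (plain(page.defaultsort).toLowerCase() === plain(name).toLowerCase()) {
    return skip('DEFAULTSORT matches title');
  }
  return { action: 'setKey', key: plain(name) };
}

// The birth year and DEFAULTSORT below are hypothetical example values.
console.log(decidePhase2({
  title: 'Amanda Mildred Carr',
  categories: ['1990 births'],
  templates: [],
  defaultsort: 'Carr, Amanda Mildred',
})); // { action: 'setKey', key: 'Amanda Mildred Carr' }
```

Returning a reason with each skip would also make it straightforward to produce the kind of log of skipped pages discussed earlier.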

Yes, I am now doing the same thing as stated above. The code is 20210422.Sorting category and sort key of Thai names.js on GitHub. It is ready to test now. Kanashimi (talk) 22:28, 22 April 2021 (UTC)[reply]
@Paul 012: Maybe we can run the full task after making sure the phase 2 code is OK? Kanashimi (talk) 05:36, 23 April 2021 (UTC)[reply]
Here are a few articles that could be used to test phase 2.
By the way, I just noticed that Thai expatriate sportspeople in Foo categories have not been tagged with ((Thai people category)), though Thai expatriates in Foo have. Looking at this again, and considering the original wording of that template's documentation ("the category contains nearly exclusively people with a Thai name living in Thailand"), I'm thinking maybe Thai expatriates in Foo shouldn't have been tagged. I'll revert those edits. --Paul_012 (talk) 22:29, 23 April 2021 (UTC)[reply]
@Paul 012:  Done Pages that are not edited are skipped. Kanashimi (talk) 23:30, 23 April 2021 (UTC)[reply]
Seems to be working as expected. TheSandDoctor, I think both functions are ready pending approval: phase 1 (a one-time task) and phase 2 (on a continued weekly basis). --Paul_012 (talk) 06:34, 24 April 2021 (UTC)[reply]

Trial complete. --Kanashimi (talk) 21:22, 26 April 2021 (UTC)[reply]

@Paul 012 and TheSandDoctor:, you've worked with this task a lot more than I have, what are your opinions on the above trial? Primefac (talk) 13:58, 25 May 2021 (UTC)[reply]
Primefac, as far as has been shown, I am satisfied that the functions are working as intended. --Paul_012 (talk) 16:21, 25 May 2021 (UTC)[reply]
@Primefac: I'll defer to Paul_012 here as the requester. If they're happy, I'm happy. Feel free to proceed, Primefac. --TheSandDoctor Talk 05:24, 26 May 2021 (UTC)[reply]
 Approved. Primefac (talk) 10:38, 26 May 2021 (UTC)[reply]
The above discussion is preserved as an archive of the debate. Please do not modify it. To request review of this BRFA, please start a new section at Wikipedia:Bots/Noticeboard.