The following discussion is an archived debate. Please do not modify it. To request review of this BRFA, please start a new section at Wikipedia:Bots/Noticeboard. The result of the discussion was  Approved.


Operator: Sheep8144402 (talk · contribs · SUL · edit count · logs · page moves · block log · rights log · ANI search)

Time filed: 16:08, Friday, October 21, 2022 (UTC)

Function overview: Fix any font tag Linter error

Automatic, Supervised, or Manual: automatic, may be supervised to reduce mistakes

Programming language(s): AWB

Source code available: User:SheepLinterBot/1 for regexes, User:SheepLinterBot/1/Signature submissions#Completed per the table

Links to relevant discussions (where appropriate): 1 (especially this) 2 3 4

Edit period(s): varies

Estimated number of pages affected: varies; usually a few hundred to a few thousand per signature, millions in total

Namespace(s): any applicable namespace with obsolete font tag linter errors (updated 26 November 2022)

Exclusion compliant (Yes/No): no

Function details: (This BRFA was originally made to fix TWA-related linter errors but was withdrawn (and postponed) because I changed my mind back then.) Fixes any signature with font tag linter errors that I request the bot to fix, so the estimated number of pages varies per signature. The linter errors it fixes vary depending on what I put in the queue, although I may use regular expressions to try to clear all the other font tag linter errors at once.

Originally it was meant to replace MalnadachBot 12 because of concerns about that bot making many edits to a single page to fix linter errors; you can see here why the bot makes many edits to a single page to fix linter errors. Some of the regexes come from here as a starting point, and I then came up with more to minimize the number of font tags left over after an edit.

Edit as of 22 December 2022: Originally I planned to fix signatures that have other linter errors from my main account, but because doing so on base user talk pages triggers the "you have new messages" notification even when the edit is marked minor, I will also request approval for the bot to fix signatures that have other Linter errors, i.e. missing end tags. Actually, never mind; that will be left for MalnadachBot 12. This bot task aims to take over fixing font tags from MalnadachBot.

Discussion

I have checked all of the 100 latest edits. There are some errors:

If you have made more such edits from your main account, please fix them. IMO it is okay if the bot skips more pages; the important thing is that the bot does not replace one error with another. ಮಲ್ನಾಡಾಚ್ ಕೊಂಕ್ಣೊ (talk) 15:00, 3 December 2022 (UTC)

  • Update: An extended trial may also be used for this, since I also requested approval to fix signatures with other Linter errors. Originally I was going to fix signatures with other Linter errors myself, but many such signatures appear on User talk pages, and editing those pages triggers a notification to the affected users. Some examples for this:
All of these result in 964 user talk pages edited, which would trigger a notification to most of those users, which I want to avoid. The bot will still skip pages where font tags remain. Edit: The true number is far higher than this, since the User talk namespace has the most obsolete tags of all namespaces. Sheep (talk · he/him) 21:04, 22 December 2022 (UTC)
Actually, that will be left for MalnadachBot 12. I don't feel like taking over the entire MalnadachBot task 12, since that task is already approved. ((BAG assistance needed)) I would like an extended trial to ensure there are no errors; I may also develop regexes to maximize the number of pages edited while avoiding errors. However, there has been no response from BAG in the few weeks since the trial was completed. Just so you know, the whole purpose of my test edits (usually a hundred; this is an example) is to ensure the bot operates correctly when using regular expressions to fix these tags. Sheep (talk · he/him) 21:42, 25 December 2022 (UTC)
I'm currently listing a sample of signatures on this page. Basically, whatever signatures get submitted are processed and listed on a page similar to this. Mean page size is calculated by adding up the sizes of these pages in kilobytes and then dividing by the number of pages with the signature. Either an extended trial through the sample or a bunch of random pages is fine with me. Sheep (talk · he/him) 04:24, 2 January 2023 (UTC)

Trial 2

Approved for extended trial (100 edits). Please provide a link to the relevant contributions and/or diffs when the trial is complete. Primefac (talk) 11:24, 11 January 2023 (UTC)

Trial complete. (contribs) No errors this time. I used the sample to demonstrate what I mean by this task. While going through the sample, I used the ten signatures' replacements along with the regex replacements so that they could all be replaced at the same time. Some points I want to note about this trial:
  • I skipped base user talk pages since the bot account is currently unflagged and editing base user talk pages would trigger a notification to those users, which I don't want.
  • The "font style/class" regex is designed to catch every character, but if the style sets a color, the color would not render on a wrapped wikilink. Because of this, I excluded square brackets from the set of characters the regex checks and put <font class=... in a separate regex. However, that means the bot will skip pages where <font style="background:... wraps wikilink(s), and pages with similar instances; I will keep using the font style/class regex for now. Nevertheless, you can get the bot to fix such instances correctly through User:SheepLinterBot/1/Signature submissions. Input a signature like <font style="color:...">[[wikilink target|something]]</font> and the bot will fix the signature correctly ([[wikilink target|<span style="color:...">something</span>]]). (updated 14:10, 12 January 2023 (UTC))
  • The first set of regexes that fix font tags is run twice, before and after processing the page, and the equal sign is now included in the set of characters allowed between font tags, which is why some consecutive font tags were fixed.
  • <b>[[User:Deiz|<FONT STYLE="verdana" COLOR="#000000">Dei</FONT><FONT COLOR="#FF3300">zio</FONT>]]</b> can be caught by the regexes and turns into <b>[[User:Deiz|<span style="verdana;color:#000000;">Dei</span><span style="color:#FF3300;">zio</span>]]</b>, but I don't know whether that is the correct replacement, since font style="verdana" isn't valid. Earlier, whenever I came across that signature, I would replace it with <b>[[User:Deiz|<span style="font-family:verdana;color:#000000;">Dei</span><span style="color:#FF3300;">zio</span>]]</b>, since "verdana" is a font face and I thought "font-family" CSS would be an acceptable replacement. Later I decided that was not valid, so I just replaced it with <b>[[User:Deiz|<span style="color:#000000;">Dei</span><span style="color:#FF3300;">zio</span>]]</b>. I do not know which I should go with.
Otherwise, everything should be A-OK. Sheep (talk · he/him) 01:17, 12 January 2023 (UTC)
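The wikilink fix described in the second bullet above (moving the styling from outside the link to inside its label) can be illustrated roughly as follows. This is only a sketch in Python, not the bot's actual AWB rule: the pattern name FONT_AROUND_LINK, the function move_style_inside_link, and the sample signature are all hypothetical, and it only handles a font tag with a double-quoted style attribute wrapping exactly one piped wikilink.

```python
import re

# Hypothetical pattern (not taken from the bot's regex list): a
# <font style="..."> tag wrapping exactly one piped wikilink.
FONT_AROUND_LINK = re.compile(
    r'<font\s+style\s*=\s*"([^"]*)"\s*>'  # opening font tag with a style attribute
    r'\[\[([^\[\]|]+)\|([^\[\]]+)\]\]'     # [[target|label]]
    r'</font\s*>',                         # closing font tag
    re.IGNORECASE,
)


def move_style_inside_link(wikitext: str) -> str:
    """Rewrite <font style=S>[[target|label]]</font> as [[target|<span style=S>label</span>]]."""
    return FONT_AROUND_LINK.sub(r'[[\2|<span style="\1">\3</span>]]', wikitext)


# Made-up signature used only for illustration.
sample = '<font style="color:green">[[User:Example|Example]]</font>'
print(move_style_inside_link(sample))
```

Putting the span inside the label is what lets the color apply to the link text; a span left outside the wikilink would be overridden by the link color.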
For things like User:Deiz's signature, I would replace it with valid CSS that matches what it currently looks like. Most users just move markup around in their signature until they get something they like. It seems they were experimenting with adding a font family before settling on something that doesn't render it. I would replace it with <b>[[User:Deiz|<span style="color:#000000;">Dei</span><span style="color:#FF3300;">zio</span>]]</b>. The trial looks good otherwise. ಮಲ್ನಾಡಾಚ್ ಕೊಂಕ್ಣೊ (talk) 02:18, 12 January 2023 (UTC)
The regexes that fix font tags with only a color attribute are used multiple times because each pass fixes only one instance of consecutive font tags. This is an instance of me using the first four regexes seven times to fix font tags.
Regexes work fine most of the time, but there are edge cases where they don't work properly when the equal sign is included in the character set. They are useful for fixing some consecutive font tags; however, when the equal sign is allowed, not all font tags get replaced. Using the regex \< *font +size *\= *(\"|\'|) *(0|1|1px|-[2-5]) *(\"|\'|) *\>(.+)\<\/ *font *\>, <font size="-2">foo</font>bar<font size="-2">baz</font> gets replaced with <span style="font-size:x-small;">foo</font>bar<font size="-2">baz</span>. The regex was supposed to stop at the first closing font tag, but it went to the second instead. I do not know why. Sheep (talk · he/him) 01:09, 13 January 2023 (UTC)
Just a note that it's because I did not make the quantifier lazy. \< *font +size *\= *(\"|\'|) *(0|1|1px|-[2-5]) *(\"|\'|) *\>(.+)\<\/ *font *\> turns <font size="-2">foo</font>bar<font size="-2">baz</font> into <span style="font-size:x-small;">foo</font>bar<font size="-2">baz</span>, but \< *font +size *\= *(\"|\'|) *(0|1|1px|-[2-5]) *(\"|\'|) *\>(.+?)\<\/ *font *\> turns <font size="-2">foo</font>bar<font size="-2">baz</font> into <span style="font-size:x-small;">foo</span>bar<span style="font-size:x-small;">baz</span>. However, that regex is now edited to include <font size="1.5">foo</font> and similar; it is now \< *font +size *\= *(\"|\'|) *([0-1]|1px|[0-1]\.[0-9]*|-[2-5]) *(\"|\'|) *\>(.+?)\<\/ *font *\>. Problem solved. However that means it can't fix font tags inside and outside wikilink(s) properly unless I reorder the first set of regexes, which I did; font tag regexes with two attributes go first.
There should be no instance of this happening, since the bot skips pages that still have font tag linter errors. In case you don't know, the order for fixing font tags is as follows: signature replacements, then regex replacements, then the first four regexes six more times. Sheep (talk · he/him) 17:05, 13 January 2023 (UTC)
Also noting that I have now coded regexes to fix font tags with color, face, and size attributes. Hopefully this further increases the percentage of pages edited on average. Sheep (talk · he/him) 17:32, 13 January 2023 (UTC)
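The greedy versus lazy behaviour discussed above can be reproduced with the two size regexes exactly as quoted. The sketch below uses Python's re module rather than AWB's .NET regex engine, on the assumption that the two engines treat (.+) and (.+?) the same way for this input; the helper name convert is hypothetical.

```python
import re

# The two regexes quoted in the discussion; they differ only in (.+) versus (.+?).
GREEDY = r'''\< *font +size *\= *(\"|\'|) *(0|1|1px|-[2-5]) *(\"|\'|) *\>(.+)\<\/ *font *\>'''
LAZY = r'''\< *font +size *\= *(\"|\'|) *(0|1|1px|-[2-5]) *(\"|\'|) *\>(.+?)\<\/ *font *\>'''

# Replacement quoted in the discussion: group 4 is the text between the tags.
REPLACEMENT = r'<span style="font-size:x-small;">\4</span>'


def convert(pattern: str, text: str) -> str:
    """Apply one of the size regexes to the text (hypothetical helper)."""
    return re.sub(pattern, REPLACEMENT, text)


text = '<font size="-2">foo</font>bar<font size="-2">baz</font>'

# Greedy: (.+) runs to the LAST </font>, so the two tags are merged into one span.
print(convert(GREEDY, text))
# Lazy: (.+?) stops at the FIRST </font>, so each tag is converted on its own.
print(convert(LAZY, text))
```

The greedy quantifier consumes as much as it can and only backs off far enough to find a closing tag, which is why the original regex paired the first opening tag with the second closing tag.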
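The processing order described above (signature replacements, then regex passes repeated several times, then skipping any page where font tags remain) might be sketched as below. This is a simplified illustration in Python, not the bot's AWB configuration: SIGNATURE_REPLACEMENTS, COLOR_ONLY, MAX_PASSES, and fix_page are hypothetical names, and a single made-up color-only regex stands in for the bot's real regex sets.

```python
import re

# Hypothetical literal signature replacement (old markup -> new markup).
SIGNATURE_REPLACEMENTS = {
    '<font color="green">Example</font>': '<span style="color:green;">Example</span>',
}

# Hypothetical color-only regex: the content may not contain another opening
# font tag, so nested tags are converted from the inside out, one level per pass.
COLOR_ONLY = re.compile(
    r'<font\s+color\s*=\s*"([^"]+)"\s*>((?:(?!<font\b).)*?)</font\s*>',
    re.IGNORECASE | re.DOTALL,
)

MAX_PASSES = 7  # assumption: one initial pass plus six repeats


def fix_page(wikitext):
    """Return the fixed text, or None if font tags remain and the page should be skipped."""
    # Step 1: exact signature replacements.
    for old, new in SIGNATURE_REPLACEMENTS.items():
        wikitext = wikitext.replace(old, new)
    # Step 2: repeat the regex until nothing changes or the pass limit is reached.
    for _ in range(MAX_PASSES):
        fixed = COLOR_ONLY.sub(r'<span style="color:\1;">\2</span>', wikitext)
        if fixed == wikitext:
            break
        wikitext = fixed
    # Step 3: skip the page rather than save it with leftover font tags.
    if re.search(r'<\s*font\b', wikitext, re.IGNORECASE):
        return None
    return wikitext


nested = '<font color="red">a<font color="blue">b</font>c</font>'
print(fix_page(nested))                         # needs two passes: inner tag, then outer
print(fix_page('<font face="Arial">x</font>'))  # not covered by the regex, so skipped
```

Returning None instead of a partially fixed page mirrors the stated design choice of skipping pages rather than making several edits to the same page.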
I've done three hundred edits once again, and I will now compare the two tests using the edit-to-page ratio, which measures the percentage of checked pages that were edited.
The comparison shows that the ratio has moved closer to 1:1. It is not possible to get it to exactly 1:1 (achievable only with a perfect set of regexes), although I will still code regexes to catch more font tags. I did three hundred edits rather than one hundred so the ratios would be more accurate. Unfortunately, this will be the last test before the bot task is approved. (Update as of 14:51, 17 January 2023 (UTC): I am going to do the very last test of 300 edits in the upcoming hours; as of this post I am in high school, so I cannot use AWB during school hours.)
I would like to point out one thing that would make the page harder to read. When using the second set of regexes to fix font tags, for some reason, if the same tag is already present inside the wikilink, another identical tag gets added. For example, <font color="red">([[foo|<font color="red">bar</font>]])</font> would be replaced with <span style="color:red;">([[foo|<span style="color:red;"><span style="color:red;">bar</span></span>]])</span>. I skipped the page containing it, although it still counts toward the ratio; that was done for accuracy reasons. Sheep (talk · he/him) 20:20, 15 January 2023 (UTC)
While this BRFA is open, I will continue to develop regexes to fix more font tags and skip fewer pages while keeping the error rate as low as possible. Before implementing them in AWB, I test the regexes using a fake signature on another website. In the meantime, since there were no errors in the extended trial, this can be approved, and then the process of fixing font tags can begin. Or, you can approve this for one last extended trial, with a mix of random pages and the sample. (updated 14:51, 17 January 2023 (UTC)) Sheep (talk · he/him) 02:26, 13 January 2023 (UTC)

Comparison of three tests of 300 edits

The very last test of 300 edits before this bot task is approved is now complete. Here are the results:

Test of 300 edits | Regexes used | Edits % (ratio)
15 December 2022 | Special:Permalink/1127480757 | 60.5% (1:1.653)
15 January 2023 | Special:Permalink/1133798228 | 65.6% (1:1.525)
18 January 2023 | Special:Permalink/1134207566 | 63.1% (1:1.585)

1 page was skipped due to characters in the Unicode Private Use Area, and 1 page was skipped due to not having font tags (there was a false positive when trying to get pages with font tags), so they had to be discounted in the ratio. Also, I had to manually skip one page due to two consecutive span tags in a wikilink when trying to fix font tags wrapping one wikilink and text around it. Apparently with the font style/class regex, a signature ended up getting replaced with span tags outside a wikilink. So either I have to make it two separate regexes, or you can submit the signature to my submission page so the bot can get the fix correct.

Regexes are made to ignore external/interwiki links, images, nowiki, math and hidden comments. To measure how effective my regexes are, I use one of two metrics depending on the scenario. If the bot did not skip pages with remaining font tags, I would use the font tag percentage. If the bot skips pages with remaining font tags, I use the edit-to-page ratio. Currently I use the ratio because I will make the bot skip pages with remaining font tags: this addresses complaints from other editors about MalnadachBot making many edits to a single page to fix font tags, it creates fewer errors when editing pages, and it is also easier to calculate. After the last test of 300 edits, either this BRFA can move on to the last extended trial, with half of the edits made on random pages and the other half from the sample, or it can go straight to approval. Sheep (talk · he/him) 13:29, 18 January 2023 (UTC)
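For readers unfamiliar with the edit-to-page ratio, the arithmetic appears to be: edits divided by counted pages gives the percentage, and counted pages divided by edits gives the right-hand side of the 1:x ratio (for example, 60.5% corresponds to roughly 1:1.653). The sketch below assumes that interpretation; the function name and the page counts are hypothetical round numbers, not figures from the tests.

```python
def edit_to_page_ratio(edits: int, pages_checked: int, discounted: int = 0):
    """Return (percentage of counted pages edited, counted pages per edit)."""
    # Pages discounted from the ratio (e.g. false positives) are removed first.
    counted = pages_checked - discounted
    if edits <= 0 or counted <= 0:
        raise ValueError("edits and counted pages must be positive")
    percent = 100 * edits / counted
    pages_per_edit = counted / edits
    return round(percent, 1), round(pages_per_edit, 3)


# Hypothetical counts: 300 edits, 502 pages checked, 2 pages discounted.
percent, per_edit = edit_to_page_ratio(edits=300, pages_checked=502, discounted=2)
print(f"{percent}% (1:{per_edit})")  # prints "60.0% (1:1.667)"
```

A ratio of exactly 1:1 would mean every checked page was edited, which is why it serves as the upper bound mentioned in the discussion.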

((BAG assistance needed)) No edit by BAG for seven and a half days. Sheep (talk · he/him) 00:12, 20 January 2023 (UTC)

For what it's worth, 7.5 days isn't really that long from a BAG perspective, though I do suppose 20 is pushing it...
Approved. Primefac (talk) 11:32, 31 January 2023 (UTC)