This is the talk page for discussing improvements to the Deep learning super sampling article. This is not a forum for general discussion of the article's subject.
This article is rated C-class on Wikipedia's content assessment scale. It is of interest to multiple WikiProjects.
The article:
Weasel words are words and phrases aimed at creating an impression that something specific and meaningful has been said, when in fact only a vague or ambiguous claim has been communicated
I'd suggest this for deletion. 62.248.185.87 (talk)
double the native resolution performance, the article showed it did nothing of the sort - on the table in the article it says performance moved from 57 fps to 91. You wrote that the quality was the same
resulting image... have the same quality as the native- the article went out to point some of the shortcomings, such as
Hair's still a sticking point, however, and aliasing where the AI has struggled to upscale the source material and while the article does go on to conclude there were
little to no expense, particularly the little part of that is not the same as the statement
same quality, which you insist on putting in the article. Furthermore, none of the testing in the supposed source material for these claims is even at 4K output resolution! Rather, the description under the table says
Control benchmarks were carried out at 1440p. What is the actual source for your claims? It clearly isn't this article. Also I already noted most of this in the edit summary, which you've conveniently ignored only to go on to rather launch further attacks against my character, which at this point include asking "do you ever read", accusations of vandalism, and stating that I had not read an article which I've spent hours commenting on. These repeated baseless accusations while refusing to read or address edit summaries is not even remotely close to good faith participation. 62.248.185.87 (talk) 09:29, 27 May 2020 (UTC)
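For reference, the speed-up implied by the two frame rates quoted above (57 fps and 91 fps, taken from the comment and not re-checked against the cited benchmark) works out as follows:

```latex
\[
\frac{91\ \text{fps}}{57\ \text{fps}} \approx 1.60,
\qquad
\frac{91 - 57}{57} \approx 0.60 = 60\%
\]
```

That is roughly a 60% increase, short of the 100% increase that "double" would require.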
Regarding the suggestion this article should be deleted: It's reasonable to raise that issue here but when there is opposition the only recourse would be to nominate the article for deletion at WP:AFD. Such a nomination would almost certainly fail because Google shows lots of different sources talking about the term in the context used by this article. Accordingly, notability is established and talk about deletion is a waste of time. As an uninvolved administrator, I am merely reporting standard procedure and have no opinion on the merits of this article. Johnuniq (talk) 07:05, 30 May 2020 (UTC)
Contradicts itself by both claiming that the technology is based on deep learning while simultaneously claiming a game with the technology was released without deep learning." The article says "Nvidia advertised DLSS" at launch in 2018 and a particular game was issued in 2019 where deep learning was not used. Is the tag justified? Johnuniq (talk) 07:16, 30 May 2020 (UTC)
(DLSS) is a technology developed by Nvidia, using deep learning", whereas the release history states that the "
2.0 (first iteration)" does "
not use machine learning", which by definition includes deep learning. Also I would not propose deletion anymore. 62.248.185.87 (talk) 19:30, 30 May 2020 (UTC)
(in some versions)" to the article, it doesn't contradict itself, but I personally don't feel confident in removing the tags since I believe it will take a miracle for the remark to stay in the article long enough to be clarified with a reliable, sourced reference rather than being edit-warred out of the article. I base this on the experience of how removing a single sentence entirely contradicted by its citation from this article got me character assassinated on my user talk page and a mod talk page, how my revisions concerning a 2716-word referenced article were undone seconds after I made them while none of the concerns were or ever have been addressed, and, worst of all, how these behaviors only led to actions against me. 62.248.185.87 (talk) 21:07, 30 May 2020 (UTC)
DLSS 2.0 is still based on AI. What made the editor think DLSS 2.0 for Control doesn't use deep learning? It just says that DLSS 2.0 doesn't need specific training for each game. 103.27.230.134 (talk) 04:15, 19 August 2020 (UTC)
Previously one editor said that the article contradicted itself because DLSS is said to be based on AI, but the second version was not based on AI. It was not clear enough, partly because they did not change the release version between this one and the most recent one (both are named DLSS 2.0 by Nvidia). I think I clarified it, also because I added a history chapter. So I removed the banner about the clarification at the top of the article. Hervegirod (talk) 09:57, 16 April 2020 (UTC)
The sentence is: "which this time is said to use machine learning and don't need to be trained on every game it is applied to". If somebody could find a better wording... Hervegirod (talk) 09:58, 16 April 2020 (UTC)
DLSS seems to be working really well in the latest 2.0 version, as has been confirmed by independent and reliable sources which did their own benchmarks. However, what we still don't know is whether the technology really works well on any game which uses the Nvidia-provided API, or only on the specific games which enable it for now. Nvidia explained that DLSS 2.0 is not trained specifically on every game, but as only a few games use the latest version of this technology, we still don't know if it works effectively on any game which uses it. I would have added this to the article, but I can't for the moment, because I could not find any source for it. Only time will tell if the "generic" term used by Nvidia is really true. Hervegirod (talk) 10:42, 19 April 2020 (UTC)
The phrase "video card overhead" has no established definition, and does not yield meaningful search results.
Searching for the phrase in Google yields 5 pages of results and a total of 48 links.
There's one single reference in a book about 3d modelling software Maya seemingly using it in the way used in the article, which appears 4 times. But even that doesn't really specify what it means more than as lost performance.
Curiously there is a single reference in a gaming access weekly article written 3 days ago about this technology, however this article doesn't define it.
This seems to count as WP:Weasel as well, but whether it does or doesn't either way I don't think matters as it's not language that should be used on a Wikipedia article as it doesn't convey meaning. 62.248.185.87 (talk) 22:18, 29 May 2020 (UTC)
I just read the contradict tag added again by User talk:62.248.185.87 with this summary : "If they called it DLSS before it had deep learning then the article can't say it's a technology based on deep learning". It's not me who called it "Deep learning super sampling" but Nvidia, furthermore the sources I found state that:
I tried to clearly say and source that in the article, especially on the "Release history" chapter. It's not my fault if Nvidia was not always clear on their promotion of their own technology. Maybe the wording could be improved to make it clearer (it seems clear in this article for me, but I am the one who wrote the "Release history" chapter originally) Hervegirod (talk)
Information about DLSS 2.1 should be added. It apparently adds an ultra performance mode as a major differentiating factor from previous versions. Svetroid (talk) 20:31, 8 December 2020 (UTC)
I'll start before somebody goes fanboy on me by saying that I could care less which of the two video card companies is "winning" the game of screwing people into buying new cards / switching brands, but I have to comment on this part:
Tensor Cores are available since the Nvidia Volta GPU microarchitecture, which was first used on the Tesla V100 line of products. Their specificity is that each Tensor Core operates on 16 bits floating point 4 x 4 matrices, and seem to be designed to be used at the CUDA C++ level, even at the compiler level.
Aside from the bizarre grammar, "operates on..." is a given, the real specificity (and how there are so many of them) is that they can do approximately jack squat besides multiply, add, load, and store elements to produce said matrices. This has no immediate appeal to the consumer market.
Fabrication of a chip for something like the V100 produces lots of defectives which they needed something to do with since enterprise customers won't accept the board where 1/4 of the SMs won't produce accurate results and another 1/4 won't work at all when they're doing something critical like drug discovery or fusion plasma simulation, hence the invention of features like this where the LSB being wrong in an element 1/10000 times won't be noticed. Slap some RGB lights on it and you can market your e-waste to gamers. AMD and Intel do the same thing, so I'm not picking on anybody here.
"Designed to be used at the CUDA C++ level" is kind of a given since nVidia wants people locked in to CUDA, which again, everybody else tries to do, although AMD at least stuck with open standards.
"or even the compiler level" means absolutely nothing. You don't "use" the machine languages you're compiling for at the compiler level. You compile high level languages through possible multiple stages until you're at either target-specific assembly or binary machine code. Every processor feature in history (since HLLs have existed) was designed to be used at the compiler level, because writing programs in hexadecimal is painful even if you know the instruction set encoding like the back of your hand. Most people don't know or bother with learning assembly, either, so a feature that can't be implemented as a compiler feature / optimization might as well not exist. This is why Itanium was a failure, but I've already ranted enough and won't go into that.
Features like the tensor units that would require an expert at multithreaded programming to program manually and require constantly updated system state to run optimally aren't even left to the compiler, except to simplify things a little for the end user, who will then use a Python module that interfaces with the C++ CUDA libs because all those semicolons in C++ are too confusing for them. Then they'll release a pixel art metroidvania that requires TensorFlow that only runs on one processor core and can't maintain 30fps at 1080p and complain about computers being too slow. See below for how the instructions actually work.
The Tensor Cores use CUDA Warp-Level Primitives on 32 parallel threads to take advantage of their parallel architecture. A Warp is a set of 32 threads which are configured to execute the same instruction.
Sorta... it's an insanely verbose instruction (but a warp primitive in CUDA compiles down to multiple instructions as below) that's sent to the warp scheduler to be executed as a ton of individual instructions on 32 hopefully optimally scheduled and localized (no guarantees are made) tensor subunits, then reassembled into a result when they finish. This cuts down on silicon per multiply-add subunit of the tensor cores (no need to do anything but multiply and accumulate, and a couple of simple bitwise ops on integers) but as far as I can tell you're SoL if you want a 4x4 matrix multiply add, the smallest "shape" they list is 8x4. There is no way, afaik, to individually address these sub-processing units from anywhere but the warp scheduler on chip, except maybe overriding data distribution. This is what the assembly looks like for a 16x16 matrix multiplication, which requires 32 tensor units each scheduled internally to do the required series of 4x4 operations:
.global .align 32 .f16 A[256], B[256];
.global .align 32 .f32 C[256], D[256];
.reg .b32 a<8> b<8> c<8> d<8>;
wmma.load.a.sync.aligned.m16n16k16.global.row.f16 {a0, a1, a2, a3, a4, a5, a6, a7}, [A];
wmma.load.b.sync.aligned.m16n16k16.global.col.f16 {b0, b1, b2, b3, b4, b5, b6, b7}, [B];
wmma.load.c.sync.aligned.m16n16k16.global.row.f32 {c0, c1, c2, c3, c4, c5, c6, c7}, [C];
wmma.mma.sync.aligned.m16n16k16.row.col.f32.f32 {d0, d1, d2, d3, d4, d5, d6, d7}, {a0, a1, a2, a3, a4, a5, a6, a7}, {b0, b1, b2, b3, b4, b5, b6, b7}, {c0, c1, c2, c3, c4, c5, c6, c7};
wmma.store.d.sync.aligned.m16n16k16.global.col.f32 [D], {d0, d1, d2, d3, d4, d5, d6, d7};
So that's the level it's *designed* to be used at. Since most humans can't smoke enough meth without dying to want to write it like that, compiler support for the PTX instructions is constantly worked on and a CUDA API was created. I went through that pedantic mess because the wording implies that being designed for use in C++ (it was either that or C) is something special, and goes on with "wow, you can even use it in compilers" as if that's some super elite feature or even makes any sense. The miracle would be if it's supported by any compilers that nVidia employees didn't add the feature to. As a former compiler engineer I'd rather mate with a garbage disposal than have to implement just the backend encoding for their instruction set, let alone an assembler / assembly printer or any kind of language support.
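For readers who don't read PTX, the arithmetic behind the m16n16k16 sequence quoted above can be sketched in plain C++ on the CPU. This is only a reference for the math (D = A * B + C), not how the hardware schedules it: the function name `mma_reference`, the constants, and the 4x4 tiling order are assumptions made for illustration, and `float` stands in for the 16-bit inputs.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

constexpr std::size_t kDim = 16;   // m = n = k = 16, as in the m16n16k16 shape
constexpr std::size_t kTile = 4;   // 4x4 sub-blocks; illustrative tiling only

// CPU reference for D = A * B + C.
// Layouts mirror the PTX above: A and C are row-major, B and D column-major.
// float stands in for .f16, so rounding will not match real hardware.
std::vector<float> mma_reference(const std::vector<float>& A,
                                 const std::vector<float>& B,
                                 const std::vector<float>& C) {
    assert(A.size() == kDim * kDim && B.size() == kDim * kDim &&
           C.size() == kDim * kDim);

    std::vector<float> D(kDim * kDim);
    // Seed the accumulator with C (row-major in, column-major out).
    for (std::size_t i = 0; i < kDim; ++i)
        for (std::size_t j = 0; j < kDim; ++j)
            D[i + j * kDim] = C[i * kDim + j];

    // Walk the 4x4 tiles: each (ti, tj, tk) step is one 4x4 multiply-accumulate.
    for (std::size_t ti = 0; ti < kDim; ti += kTile)
        for (std::size_t tj = 0; tj < kDim; tj += kTile)
            for (std::size_t tk = 0; tk < kDim; tk += kTile)
                for (std::size_t i = ti; i < ti + kTile; ++i)
                    for (std::size_t j = tj; j < tj + kTile; ++j) {
                        float acc = 0.0f;
                        for (std::size_t k = tk; k < tk + kTile; ++k)
                            acc += A[i * kDim + k] * B[k + j * kDim];
                        D[i + j * kDim] += acc;
                    }
    return D;
}
```

With this tiling a single 16x16x16 product breaks down into 4 * 4 * 4 = 64 of the 4x4 multiply-accumulate steps, which gives a feel for why the work is spread over a whole warp.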
The big snafu as far as I can tell is that GPUs don't do conditional speculative execution in the normal sense. If there are multiple code paths, they execute all of them at once and throw away the results for anything not needed. This makes the AI / deep learning sound fancy but it's really a way to minimize the number of times that happens on a user machine by pre-deciding most things and letting the computer run the resulting state machine. The deep learning part of it never happens on the GPU of the DLSS end user, it's just running the net Nvidia trained, which the article got right... and I'd be interested to see how much of the tensor core capacity is in actual use with features like this. I'll quit my rant here and fix the grammar and try to improve the paragraph a bit later, but I couldn't help myself, marketing speak and the current disaster that is the GPU market price that resulted from it anger me.
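The cost of multiple code paths described above can be illustrated with a toy CPU model of one 32-thread warp. It is a simplification under stated assumptions: real hardware runs each taken path in turn with the non-participating lanes masked off rather than literally computing and discarding, and all names here (`run_branch`, `WarpResult`, the slot counters) are hypothetical.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

constexpr std::size_t kWarpSize = 32;
using Lanes = std::array<int, kWarpSize>;

struct WarpResult {
    Lanes values{};
    std::size_t lane_slots_issued = 0;  // lane slots occupied, useful or not
    std::size_t lane_slots_kept = 0;    // lane slots whose result was committed
};

// Toy model of `out = (x % 2 == 0) ? x * 2 : x + 100` run by one warp.
WarpResult run_branch(const Lanes& x) {
    WarpResult r;
    std::array<bool, kWarpSize> takes_then{};
    bool any_then = false;
    bool any_else = false;
    for (std::size_t lane = 0; lane < kWarpSize; ++lane) {
        takes_then[lane] = (x[lane] % 2 == 0);
        any_then = any_then || takes_then[lane];
        any_else = any_else || !takes_then[lane];
    }

    // Pass 1: the "then" side. Every lane occupies a slot; only lanes whose
    // predicate is true commit a result.
    if (any_then) {
        for (std::size_t lane = 0; lane < kWarpSize; ++lane) {
            ++r.lane_slots_issued;
            if (takes_then[lane]) {
                r.values[lane] = x[lane] * 2;
                ++r.lane_slots_kept;
            }
        }
    }
    // Pass 2: the "else" side, skipped entirely if no lane needs it.
    if (any_else) {
        for (std::size_t lane = 0; lane < kWarpSize; ++lane) {
            ++r.lane_slots_issued;
            if (!takes_then[lane]) {
                r.values[lane] = x[lane] + 100;
                ++r.lane_slots_kept;
            }
        }
    }
    assert(r.lane_slots_kept == kWarpSize);  // each lane commits exactly once
    return r;
}
```

In this model a warp whose lanes all agree issues 32 lane slots, while a warp split between the two sides issues 64 and keeps only 32 of them, which is the waste the comment is pointing at.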
Cheers ~~---- A Shortfall of Gravitas (other machine) (talk) 08:22, 4 April 2021 (UTC)
Currently the second paragraph claims that "As of June 2021, this technology is available exclusively on GeForce RTX 20 and GeForce RTX 30 series of graphics cards.". Aside from being an obviously false claim, it seems to me to be a clear violation of MOS:PUFFERY. Either way this untrue claim has stood in the article since the creation when Hervegirod put it in. It's also contradicted later in the article. 62.248.185.4 (talk) 20:31, 7 May 2022 (UTC)
How should it be spelled? "Tensor cores", "Tensor Cores", or "tensor cores"?
--Mortense (talk) 13:31, 25 November 2022 (UTC)
Neither Nvidia nor Remedy has ever confirmed that such a thing as DLSS 1.9 exists. When Remedy implemented DLSS 1.0 in the game Control, it surpassed the quality of similar implementations in other games.
After the release of DLSS 2.0 someone coined the previous implementation DLSS 1.9 which spread like wildfire.
But it was only coined because of its qualitative similarity to 2.0. Which never excluded Remedy and Nvidia working together to provide a better implementation (and better machine learning training).
Which does not at all imply it being a different implementation or version of DLSS 1.0 at all. Only that Remedy's own TAA implementation and their implementation of DLSS 1.0 was an improvement.
But the same could have been said about the implementations of DLSS 1.0 in Metro Exodus and for example Battlefield V. Both of which had implementations so vastly differing in quality that one could imply they were different versions of DLSS 1.0 when they were not.
This self-proclaimed fact of DLSS 1.9 being an actual differing iteration from DLSS 1.0 comes from the perception that DLSS 1.9 is/was the same (or almost the same) algorithm as DLSS 2.0 but ran on shader cores instead of Tensor Cores.
This thinking is fueled by the presumption (and common disdain for Nvidia marketing and sales tactics) that Nvidia forced obsolescence by making DLSS 2.0 only work on RTX GPUs when it could have worked on all GPUs as a shader-based implementation.
Which is completely unfounded, biased and complete speculation.
I believe much of this is based on a wishful perception by opponents of Nvidia and an unfortunate miswording or misunderstanding by Techspot.com, who likely coined the term or at the very least picked it up in a more official journalistic capacity and therefore helped its propagation. (see https://www.techspot.com/article/1992-nvidia-dlss-2020/)
The only facts that remain are that only DLSS 1.0 and DLSS 2.0 officially ever existed as upscaling techniques and that within releases of games with DLSS 1.0 support we always had vast differentials in image quality.
These differences have never been proven to be anything but the quality of implementation and quality of game specific machine learning data.
Which as we know was one major factor in the different qualities of DLSS 1.0 implementations. The fact that DLSS 1.0 had to be trained on a per-game basis.
I therefore conclude that, until Nvidia ever officially states otherwise, DLSS 1.9 never existed.
Remedy's implementation of DLSS 1.0 simply outshone other implementations, either through the amount of effort put into their own engine and/or the amount of game-specific machine-learned training, which made people believe it was a different version entirely. Fnna509 (talk) 11:49, 13 August 2023 (UTC)
DLSS 3 has 2 main components:
1. Upscaling
2. Frame Generation
While DLSS Frame Generation is indeed exclusive to Ada Lovelace NVIDIA GPUs, DLSS 3 (Subsequently, the third generation of DLSS) Upscaling isn't. 85.198.63.121 (talk) 16:07, 30 October 2023 (UTC)