Creating 3D graphics is far from a trivial process; it demands a comprehensive approach to both software and hardware. Modern professional graphics cards fuse advanced technologies, combining a powerful multi-core graphics processor (GPU) with a parallel architecture and software that makes full use of the GPU's resources.
Let us consider how the CPU and the GPU differ in parallel computing.
The growth of clock frequencies in general-purpose processors has run into physical limits and high power consumption, so their performance increasingly comes from placing several cores on one chip. Processors sold today contain at most four cores (further growth will not be fast), are designed for ordinary applications, and use MIMD: multiple instruction streams, multiple data streams. Each core works separately from the others, executing different instructions for different processes.
Specialised vector capabilities (SSE2 and SSE3) for four-component (single-precision floating point) and two-component (double-precision) vectors appeared in general-purpose processors primarily because of the growing demands of graphics applications. This is why using a GPU is more advantageous for certain tasks: GPUs were designed for them from the start.
For example, in NVIDIA video chips the basic unit is a multiprocessor with eight to ten cores and hundreds of ALUs in total, several thousand registers and a small amount of shared memory. In addition, the video card contains fast global memory accessible to all multiprocessors, local memory in each multiprocessor, and special memory for constants.
Most importantly, the several cores of a GPU multiprocessor are SIMD (single instruction stream, multiple data streams) cores. They execute the same instructions simultaneously. This style is common in graphics algorithms and many scientific problems, but it requires specific programming. In return, it allows the number of execution units to be increased by simplifying them.
To summarise the main differences between CPU and GPU architectures: CPU cores are built to execute a single stream of sequential instructions with maximum performance, while GPUs are designed for fast execution of a large number of instruction streams running in parallel. General-purpose processors are optimised for high performance on a single instruction stream that processes both integers and floating-point numbers, with random access to memory.
To raise performance, CPU designers try to execute as many instructions in parallel as possible. For this purpose, superscalar execution appeared with the Intel Pentium, providing two instructions per clock cycle, and the Pentium Pro distinguished itself with out-of-order execution. But parallel execution of a sequential instruction stream has fundamental limits, and adding execution units does not yield a proportional gain in speed.
The work of video chips is simple and parallel from the outset. The video chip takes a group of polygons as input, performs all the necessary operations, and outputs pixels. Polygons and pixels are processed independently, so they can be handled in parallel, separately from one another. Because the work is parallel by nature, a GPU uses a large number of execution units that are easy to keep busy, unlike the sequential instruction stream of a CPU. Modern GPUs can also execute more than one instruction per clock cycle (dual issue). For instance, the Tesla architecture under some conditions issues MAD+MUL or MAD+SFU operations simultaneously.
The GPU also differs from the CPU in how it accesses memory. On the GPU, access is coherent and easily predicted: if a texture texel is read from memory, the neighbouring texels will be needed shortly afterwards. The same holds for writes: a pixel is written to the frame buffer, and a few cycles later the pixel next to it is written. The memory organisation therefore differs from that of a CPU. Unlike general-purpose processors, the video chip simply does not need a large cache; textures require only a few kilobytes (up to 128-256 in current GPUs).
Working with memory itself also differs somewhat between GPUs and CPUs. Not all central processors have built-in memory controllers, whereas every GPU usually has several, up to eight 64-bit channels in the NVIDIA GT200 chip. In addition, video cards use faster memory, so video chips have several times more memory bandwidth available, which is very important for parallel computations that operate on huge data streams.
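As a rough illustration of why a wide memory interface matters, theoretical peak bandwidth can be estimated as bus width multiplied by the effective transfer rate. The eight 64-bit channels come from the text above; the transfer rates and the dual-channel CPU configuration in this sketch are illustrative assumptions, not measured figures.

```python
def peak_bandwidth_gb_s(channels, bits_per_channel, transfers_per_second):
    """Theoretical peak memory bandwidth in GB/s (1 GB = 10**9 bytes)."""
    bus_width_bytes = channels * bits_per_channel // 8
    return bus_width_bytes * transfers_per_second / 1e9


# GPU: eight 64-bit channels (from the text); 2.2e9 transfers/s is an assumed rate.
gpu = peak_bandwidth_gb_s(8, 64, 2.2e9)
# CPU: a hypothetical dual-channel setup, two 64-bit channels at an assumed 1.066e9 transfers/s.
cpu = peak_bandwidth_gb_s(2, 64, 1.066e9)

print(f"GPU ~{gpu:.1f} GB/s, CPU ~{cpu:.1f} GB/s, ratio ~{gpu / cpu:.1f}x")
```

Under these assumptions the gap comes from both factors at once: four times as many channels and roughly twice the transfer rate.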
In general-purpose processors, a large share of the transistors and chip area goes to instruction buffers, hardware branch prediction and huge on-chip caches. All these hardware blocks are needed to speed up the execution of a small number of instruction streams. Video chips spend their transistors on arrays of execution units, thread control units, a small amount of shared memory and multi-channel memory controllers. None of this speeds up individual threads; instead it lets the chip process several thousand threads that run simultaneously and demand high memory bandwidth.
There are also differences in caching. General-purpose CPUs use cache to raise performance by reducing memory access latency, whereas GPUs use cache or shared memory to increase bandwidth. CPUs reduce memory latency with large caches and with branch prediction. These hardware blocks occupy most of the chip area and consume a lot of power. Video chips get around memory latency by executing thousands of threads simultaneously: while one thread waits for data from memory, the video chip can run the computations of another thread without waiting or delay.
There are many differences in multithreading support as well. A CPU executes 1-2 computation threads per core, while video chips can support up to 1024 threads per multiprocessor, of which there are several on a chip. And whereas switching from one thread to another costs a CPU hundreds of clock cycles, a GPU switches several threads in a single cycle.
In addition, CPUs use SIMD units (one instruction executed over multiple data items) for vector computations, while video chips apply SIMT (single instruction, multiple threads) to scalar processing of threads. SIMT does not require the developer to pack data into vectors, and it permits arbitrary branching within threads.
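How a group of scalar threads can share one instruction stream and still branch can be sketched as follows. This is a simplified model that assumes branches are handled by running each side in turn under a per-thread mask; it is not a description of any particular chip, and the function name and the branch body are invented for the example.

```python
def simt_execute(values):
    """Run `x // 2 if x is even else 3 * x + 1` for a group of threads in lockstep.

    Every thread holds one scalar value; the group shares one instruction
    stream. Each side of the branch is issued only if some thread needs it,
    and a mask decides which threads take part in that pass.
    """
    mask_even = [v % 2 == 0 for v in values]
    result = list(values)
    passes = 0
    # Pass 1: the "then" side, active only where the mask is set.
    if any(mask_even):
        result = [v // 2 if m else r
                  for v, m, r in zip(values, mask_even, result)]
        passes += 1
    # Pass 2: the "else" side, active for the remaining threads.
    if not all(mask_even):
        result = [3 * v + 1 if not m else r
                  for v, m, r in zip(values, mask_even, result)]
        passes += 1
    return result, passes


print(simt_execute([2, 4, 6, 8]))  # all threads agree: one pass
print(simt_execute([1, 2, 3, 4]))  # threads diverge: two passes
```

The programmer writes ordinary scalar code per thread; the cost of divergence shows up only as extra passes when threads in the same group take different sides of a branch.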
In short, unlike modern general-purpose CPUs, video chips are designed for parallel computations with a large number of arithmetic operations. A much larger share of GPU transistors does the actual work of processing data arrays, rather than controlling the execution (flow control) of a few sequential computing threads. The diagram below shows how much space various kinds of logic occupy in a CPU and a GPU:
|CPU and GPU|
It is well known that a parallel architecture delivers a many-fold performance gain over a classical CPU architecture only in certain cases: above all differential equations, graphics computation, hydrodynamics and the like. An important condition is correctly written program code that parallelises the tasks as far as possible. When these conditions are met, system efficiency rises many times over.
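One common way to quantify this dependence on how much of the code is parallelised is Amdahl's law. The article does not name it; it is brought in here only as an illustrative model, and the figure of 240 execution units is an assumption.

```python
def amdahl_speedup(parallel_fraction, workers):
    """Amdahl's law: upper bound on speedup when only part of the work scales."""
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / workers)


# 240 execution units is a hypothetical figure used only for illustration.
for p in (0.5, 0.9, 0.99):
    print(f"parallel fraction {p:.2f}: at most {amdahl_speedup(p, 240):.1f}x")
```

Even with hundreds of execution units, the serial part of the program caps the achievable gain, which is why the quality of the parallel code matters as much as the hardware.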
Professional 3D graphics is in many respects more than creating a digital 3D prototype. It also involves a number of specific tasks, such as computing the dynamics of physical processes, photorealistic visualisation and others, for which special professional graphics accelerators are built.
A modern professional video card is a solution that consists of the hardware (the graphics board) and a set of unique drivers optimised for design work in professional applications. In producing professional video cards and their drivers, the emphasis is on reliability and long trouble-free operation. They are therefore several times more reliable than gaming video cards, and their drivers are certified by the vendors of professional applications.
The most interesting hardware and software solutions in this area today come from NVIDIA, with its NVIDIA Quadro FX line of professional graphics cards.
Here is a list of the main tasks such cards can handle in design work:
- support for CUDA (Compute Unified Device Architecture), a development environment for writing software for complex computing tasks not related to graphics;
- fast, high-quality graphics rendering in CAD applications;
- hardware acceleration of physics by the NVIDIA PhysX engine;
- support for stereoscopic 3D images;
- the OptiX ray tracing engine.
As we can see, the range of tasks is very wide and is not limited to graphics computation. Let us look at each of the listed points in more detail.
NVIDIA Quadro FX is a series of professional graphics cards originally created for graphics computation in 3D applications such as Autodesk 3ds Max, AutoCAD, Autodesk Inventor, Autodesk Revit and others. Practically all NVIDIA professional video cards provide much higher graphics rendering speed than ordinary consumer video cards. Image quality and service life should not be forgotten either: on professional cards these are also considerably higher.
The PhysX hardware physics engine long ago won its place in video games, where explosions, fire, water, the destruction of objects and so on must be computed in real time. But the technology has proved extremely useful for professionals too. In particular, PhysX is actively used as a plug-in in Autodesk 3ds Max and Autodesk Maya, where it makes it possible to simulate cloth, fluids, and rigid and soft bodies.
The plug-in can be downloaded free of charge from the NVIDIA website.
Stereoscopy is a technology that has found a second wind in recent years. Many studios that create 3D animation release, alongside the usual version, a stereo version specially adapted for cinemas. NVIDIA has developed glasses that make it possible to work with 3D models in stereo mode. They rely on shutter mechanisms built into the glasses, which allows the image to be conveyed in full colour, unlike simple two-colour glasses.
Naturally, stereoscopic 3D models in viewports make it easier to orient oneself in the space of the scene being created. Creating a stereoscopic film without such technology seems altogether inconceivable.
The OptiX ray tracing engine is a technology that makes it possible to produce the final photorealistic render of a project several times faster. Suffice it to say that Industrial Light and Magic, the well-known Lucasfilm special effects division responsible for the visual effects in films such as “Pirates of the Caribbean”, “Iron Man”, “Transformers” and “Indiana Jones”, is currently building a new render farm based on NVIDIA GPUs.
It should be noted that realistic visualisation technologies are not limited to OptiX. NVIDIA fairly recently acquired mental images, the company that develops the mental ray renderer. After joint work combining mental images software and NVIDIA hardware, the release of a new product named iray was announced. The developers state that it is integrated with the mental ray renderer, which is used in software products from Autodesk (including AutoCAD), Dassault and PTC. Very soon we will be able to render our projects with mental ray on an NVIDIA GPU instead of the central processor, which is not really designed for such computations. In its press release NVIDIA states: “The iray rendering technology will be included in the mental ray 3.8 package and will reach the market at the end of November 2009 at no additional charge for current customers and software vendors”. iray is much faster at rendering (“minutes” instead of “hours”, according to Jon Peddie, president of Jon Peddie Research). However, it requires a new NVIDIA graphics board, since it runs only on the GPU and on the CUDA parallel computing architecture.
A few words should be said about CUDA, which is supported by all recent generations of NVIDIA graphics cards. CUDA is an architecture that makes it possible to use the power of the graphics processor for general-purpose computing. In essence it is a development environment for any computation that benefits from a parallel processor architecture. At present CUDA supports the C, C++ and Fortran programming languages, and this is not the limit. The CUDA platform offers practically unlimited possibilities for using GPU resources in computations of any complexity.
Technology does not stand still; on the contrary, in recent years we have seen exponential growth and the emergence of new directions of development. Today the graphics card in a 3D specialist's computer is not only a device that raises the frame rate when working with complex 3D models, but also a means of accelerating work with 3D graphics in practically all its variety.
|GPU performance growth compared with the CPU|
CUDA, the hardware and software architecture NVIDIA offers for computing on video chips, is well suited to a wide range of highly parallel problems, in particular to accelerating work with 3D graphics.
CUDA runs on a large number of NVIDIA video chips and improves the GPU programming model, simplifying it considerably and adding many capabilities, such as shared memory, thread synchronisation, double-precision computation and integer operations.
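To show what shared memory and thread synchronisation are used for, here is a sequential Python emulation of a block-wise sum. It is a sketch of the general pattern, not CUDA code and not NVIDIA's implementation; the function name is hypothetical, and `block_dim` is assumed to be a power of two.

```python
def block_sum(data, block_dim):
    """Emulate a per-block tree reduction; returns one partial sum per block.

    `block_dim` (threads per block) is assumed to be a power of two.
    """
    partial_sums = []
    for block_start in range(0, len(data), block_dim):  # one iteration = one thread block
        # Stand-in for the block's shared memory, padded with zeros for the last block.
        shared = list(data[block_start:block_start + block_dim])
        shared += [0] * (block_dim - len(shared))
        stride = block_dim // 2
        while stride > 0:
            # Every "thread" with tid < stride adds its partner's element.
            for tid in range(stride):
                shared[tid] += shared[tid + stride]
            # On a GPU a barrier would be needed here, so that all threads
            # see the updated values before the next step starts.
            stride //= 2
        partial_sums.append(shared[0])
    return partial_sums


print(block_sum(list(range(16)), 8))
```

Within one step no thread reads an element that another thread writes, which is why the steps can run in parallel on real hardware as long as a synchronisation point separates them.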
Even final rendering, for years the prerogative of the CPU, is now being shifted quite successfully onto the “shoulders” of graphics processors, which almost always leads to a many-fold increase in the speed of computing the final image or video.
Autodesk is optimising its products for CUDA. In particular, in the summer of 2009 Autodesk announced that Autodesk Moldflow 2010 had been optimised for CUDA. The result is a more than twofold speed-up. Autodesk Moldflow is a software solution that helps designers determine and optimise the behaviour of injection-moulded plastic parts at various stages of design and manufacture. Hardware support for Autodesk Moldflow is provided by high-end professional graphics cards: NVIDIA Quadro FX 4800 and Quadro FX 5800.
“Taking full advantage of the broad capabilities and massively parallel computing of NVIDIA Quadro professional graphics processors gives our users a noticeable performance gain,” said Eric Stover, AutoCAD product manager at Autodesk. “In AutoCAD 2010 we have improved 3D modelling capabilities and functions, and the unsurpassed quality of 3D models provided by Quadro graphics processors will give all designers an appreciable advantage.”
Besides higher performance, the benefits of using Quadro solutions together with AutoCAD 2010 include:
- Best-in-class quality: Quadro processors offer the best price/performance ratio for graphics processing on workstations, providing an optimal combination of quality, computational accuracy and performance. Quadro solutions are designed, manufactured and tested by NVIDIA and meet the highest quality standards.
- Unsurpassed performance: Quadro processors raise working speed up to 5 times in 3D Hidden mode and accelerate real-time manipulation in Conceptual and Realistic modes.
- Easy work with 3D models: Quadro processors make it easy to manipulate and interact with large-scale models while keeping the maximum picture quality needed for accurate computation of 3D models with a large number of polygons.
- Highest image quality: Quadro processors deliver improved image quality without loss of performance thanks to anti-aliased line rendering in AutoCAD, a special Quadro function available in NVIDIA drivers for raising AutoCAD performance.
- Professional multi-display support: NVIDIA nView® software provides maximum capabilities both with one large screen and with several; the resolution of each screen can reach 2560 x 1600.
It should be noted that Quadro processors in every price category combine advanced graphics processing capabilities with special drivers that raise performance in AutoCAD 2010, allowing designers to create and manipulate more complex 3D projects.
Clearly, even more AutoCAD functions running on NVIDIA Quadro boards and CUDA will appear in the near future.
ASCON, together with NVIDIA, has also announced faster operation of KOMPAS-3D on computers equipped with NVIDIA Quadro FX professional solutions. An application profile in the Quadro driver will give KOMPAS-3D users a substantial speed-up when rotating and positioning 3D models. Depending on model complexity, the speed-up can range from 30 to 50%.
“Increasing the speed of our users' work is one of the most important tasks we address from version to version,” stressed Oleg Zykov, head of product marketing at ASCON. “Some reserves for this lie in optimising the product and in new functions that simplify work with assemblies. Now, however, we will also make full use of the most powerful hardware resource: professional video cards.”
In turn, Alexey Lagunenko, head of sales for NVIDIA in Eastern Europe, noted: “NVIDIA actively cooperates with software developers around the world and is always ready to support innovative ideas and promising products. ASCON is one of the leading Russian developers of mainstream CAD/AEC/PLM solutions, and we are glad to offer the wide range of ASCON product users support both at the level of professional hardware solutions and in terms of software optimisation.”
Thus, NVIDIA professional solutions for accelerating 3D graphics computation raise the efficiency of design work.
1. Professional NVIDIA solutions for accelerating work with 3D graphics // CAD and Graphics. 2010. No. 2. P. 42-43.
2. Berillo A. NVIDIA CUDA: non-graphics computing on graphics processors. 2008 [http://www.ixbt.com/video3/cuda-1.shtml]
3. NVIDIA Quadro processors take 3D design to new heights for AutoCAD 2010 users. 2009 [http://www.nvidia.ru/object/io_1242377419521.html]
4. ASCON and NVIDIA accelerate KOMPAS-3D. 2009 [http://ascon.ru/press/news/items/?news=582]
Author: Челябэнергопроект