Chapter 513: A Brick

128 cores sounds like a lot, but really it is nothing special.

It is only about the core count of eleven Allwinner A31 SoCs. After all, each of those is the legendary "quad-core CPU plus eight-core graphics, twelve cores in all"!

Speaking of which, GPUs have traditionally been built with a large number of concurrent pipelines, because they must finish a huge amount of image rendering in the shortest possible time and so put more weight on parallel computing. Take the A31's "eight-core graphics": it is said to contain two SGX544 GPU modules, each with four rendering units called Universal Scalable Shader Engines (USSE2). As a result, a chip that was really just a dual-GPU part was hyped into "eight graphics cores" by profiteers. At first they were a little coy about it, saying only "eight graphics units" rather than "eight-core graphics card"; later they dropped all restraint.

As for NVIDIA, the vendor with the strongest graphics performance among mobile SoCs, its latest tablet SoC, the Tegra K1, has gone all the way to a heaven-defying 192-core GPU, and looks as if it is about to beat desktop graphics cards.

In fact, that is not the case. The thermal design power of the entire K1 is only 5 watts, so for the time being there is no comparison with desktop graphics cards. At the very least, the Shuangmin Big Crazy Bull GT440-4GB, with only 96 CUDA cores, could easily kill it in seconds.

The Crazy Bull series of graphics cards has always been known for its scary-looking oversized video memory and its underpowered low-end GPUs. The cards became a tool for profiteers to fool consumers, and their weak cores and cheap, slow DDR video memory became ammunition for hardware veterans to mock both the manufacturer and its novice buyers. Only the owners of gold-farming studios, who smiled and said nothing, knew that "a fine horse deserves a fine saddle, and gold farming deserves a Crazy Bull": this card is actually very useful.

For ordinary users, given the speed of the GF108 GPU, 4GB of video memory can never be used up under normal circumstances. For online game studios, however, this kind of cheap, large-memory graphics card is simply tailor-made. They do not care about picture quality; to keep things smooth and run as many clients as possible, everything is set to the lowest image quality anyway. In that situation a graphics card that uses ordinary memory chips as slow video memory is very practical, and it has no trouble hosting a dozen or even dozens of online game clients.

Unfortunately, this is now the age of free-to-play online games, and such games generally share the problems of an unhealthy economy and rapidly depreciating items, so many studios have changed trades and taken up compute mining instead. Compared with online games, which need underlings to babysit them, the popular compute mining is far simpler: as long as the network and the power stay up, the computer makes money for the boss just by being switched on.

The two best-known targets for compute mining in China are Bitcoin, the most popular foreign coin, and the Bee P coin, also known as bee food stamps. Unlike Bitcoin, whose price swings with the market, the price of Bee P has stayed stable with a slight downward drift. You cannot make a fortune from it, but neither will you lose money to price swings as with Bitcoin, and it does not become a target for speculation.

However, whether you mine Bitcoin to earn dollars or follow the bees to cover part of the electricity bill, the demands on the computer are relatively high, above all on the computing power of the GPU. This is especially true of Bitcoin mining: in the early days every kind of high-end graphics card was put to work mining, then sold off at a discount once it was nearly worn out. A GPU as weak as the Crazy Bull's naturally was never fated to be a "mining card". And so Shuangmin, which made its living selling bulls, went under.

Whether it is AMD's SP stream processors, NVIDIA's CUDA general-purpose parallel computing units, or the x86 cores of Intel's Xeon Phi, none of them is a truly independent "core". In structural complexity, none can compare with Atletico's IPU. After all, the first three have to consider the cost of mass production, and an overly complex circuit design leads to lengthy production processes and low fault tolerance, which no sensible designer would accept.

In fact, Atletico's IPU is more like a souped-up version of Intel's Xeon processors.

The Xeon E5-2699 v3, which Intel released in the second half of the year and which is the most powerful mass-produced CPU on the planet, has 18 physical cores. Each core has its own 32+32KB of high-speed L1 cache and 256KB of L2 cache, and all of them share "up to" 45MB of L3 cache.

Chip designers are of course well aware of the von Neumann bottleneck, so they began giving CPUs a cache very early on. Caches generally use SRAM (static random access memory). Compared with the DRAM (dynamic random access memory) commonly used in memory modules, SRAM has the advantages of needing no refresh circuit and of fast reads and writes; its disadvantages are low circuit density and high cost.

High cost and low density are both fatal for a CPU, so it is not hard to see why, to this day, the capacity of a CPU's L1 and L2 caches is still measured in KB, while the L3 cache using DRAM can be made somewhat larger but not too large. After all, every square millimeter of chip area is precious.

Even for the reigning king of CPUs, the 2699, all three cache levels add up to only about 50MB, which is naturally not enough for applications. When data is not found in the cache, the system still has to look in main memory or even on the slower hard disk, which naturally slows things down further.

For Atletico, these are basically non-issues. The first-generation IPU chip carries the model number 128-16/16, which means 128 computing cores, each allocated 16MB of L1 cache, all sharing 2GB of L3 cache, or an average of 16MB per core.

This is possible, first, because a certain someone has mastered black technology and can skillfully manufacture carbon-based chips; the only problem is that output is too low because production is done "by hand". Second, the design philosophy of the IPU is entirely new: CPUs, GPUs, and even animal nervous systems all served as references. For example, HBM high-bandwidth memory, currently a hot topic in the graphics card field, has also been borrowed by Atletico.

Since the memory is 3D-stacked, the processor part naturally could not stay flat either. In fact, Atletico is more radical than Intel and Samsung when it comes to 3D transistors. Those companies have to consider how to implement the process, whereas Atletico can let his imagination run free, and a failure costs him only a few wasted days.

Compared with an animal's nervous system, the IPU's 128 computing cores, each with its own "memory", are not many, but they are basically enough. Atletico had tested schemes with different numbers before. In general, the fewer the cores and the smaller the "memory", the worse the results, and vice versa. However, as cores were added and "memory" was expanded, the transistor count rose rapidly, and the thickness, area, and heat output of the chip rose with it. In the end he had to settle on the 128-16/16 scheme as a compromise.

The human cerebral cortex is less than 3 mm thick on average, but it is full of folds and grooves, with a total area of up to 2,200 square centimeters when fully unfolded, and it is estimated to contain about 14 billion nerve cells. The number of human brain cells also falls every day: an estimated 100,000 brain cells die daily and are not replenished. Fortunately, the self-correcting ability of the "biological brain computer" is much stronger than that of an electronic computer. The work of dead brain cells is quickly taken over by other cells, so the brain does not usually give you a blue screen of death at every turn.

Clearly, the complex connections between neurons provide very powerful redundancy. As a living system, the brain finds it hard to grow new neurons, but it can build new synaptic connections to adapt automatically to all kinds of situations. This is what lies behind the saying that the brain gets sharper the more it is used, and the same holds for the spontaneous changes in a patient's brain. Of course, if the brain is overused, or if a disease progresses too fast and too severely for the brain's self-regulation to cope, problems will still occur.

When Atletico designed the three-dimensional transistor architecture of the IPU, he deliberately imitated animal neural networks: first, to raise transistor density with the help of the three-dimensional layout, and second, to build a "neural network" inside the chip.

This network is still very simple and crude, and it cannot compare with the computer neural networks the major giants have built for their artificial intelligence research. Even so, Atletico is very optimistic about its development. After all, communication inside a chip beats communication between computers over network cables in both speed and energy consumption.

In some "simple" situations in particular, the results are even better than those of the large systems. The animals now running around the bee garden have already reached the point of "when rabbits run side by side, who can tell the real from the fake", and the number of human customer service positions at the bee customer service center has fallen rather than risen. Other artificial intelligence applications perform just as well, whether speech recognition and synthesis, semantic recognition, machine translation, image recognition and synthesis, fuzzy computing, or AI simulation.

On the one hand, output is too low, so the chip can only be used in-house. On the other, because the transistor count is so large, overall energy consumption is also rather high, and the chip cannot go into mobile devices for the time being. Atletico is in no hurry to stuff the IPU into an artificial animal and rush off to grab the pet market. What he really cares about is using the IPU in the next-generation PT2, the pony electric vehicles, and the Pegasus aircraft.

Game consoles, needless to say, require a great deal of artificial intelligence and human-computer interaction technology, and the latter two, as intelligent transportation equipment, need cleverer brains as well. Unfortunately, current lithium-ion batteries do not hold enough charge, and battery life would become very poor once an IPU drawing more than 100 watts was installed.

The aforementioned most powerful CPU on the planet, the Xeon E5-2699 v3, integrates approximately 5.7 billion transistors, has a die area of 662 mm², and has a thermal design power (TDP) of 145 watts.

TDP is a safety figure. Chip manufacturers use it to indicate the maximum heat output of their chips, as a reference for other manufacturers, so that an inadequate cooler does not lead to accidents such as the system overheating or even melting and deforming. TDP is therefore usually higher than the maximum power consumption of the chip itself. Most of today's mainstream chips also have frequency-scaling and power-saving features, so actual operating power may be only one-third of TDP or even less.

Because the carbon nanotubes are only about 5 nanometers in size, Atletico packed about 20 billion transistors into the IPU. Even so, the average per core is only about 160 million transistors, which is relatively low not only against Intel x86 CPUs but even against ARM mobile processors. For example, the latest Apple A8 combines a dual-core CPU with a quad-core GPU, yet it has 2 billion transistors, an average of more than 300 million per core.

Of course, this comparison is too rough. A real mobile phone processor is an SoC: it must integrate not only the CPU, the GPU, and the space-hungry SRAM cache, but also set aside a large area for specialized processors such as the DSP and ISP. The area actually devoted to the CPU and GPU is quite limited, so overall its per-core transistor count naturally cannot match that of a desktop CPU.

The more transistors there are, the more heat is inevitably generated. Even with lower-resistance carbon transistors, the power advantage of the advanced material is cancelled out by the sheer number of transistors, which is one reason the design stopped at 128 cores.

When Wei Wei finally saw the long-awaited IPU, he could not help asking again and again in surprise whether there had been some mistake.

The IPU that Atletico gave him was not a chip, as he had imagined, but an expansion card with a PCI-E gold-finger connector. In short, it was a brick that looked very, very much like a high-end graphics card.

Most of the AI card's thickness is actually taken up by a water-cooling system made of aluminum-magnesium alloy. Not only is the chip inside covered and invisible, even the onboard memory and flash cannot be seen. Although the chip itself has a total of 4GB of on-chip memory, Atletico was afraid that would not be enough, so he still piled 8GB of DDR memory and 128GB of flash chips onto the board.

The all-metal shell looks quite high-tech and futuristic, but graphics card makers have been doing that for a long time, and this AI card looks far too much like a graphics card. Wei Wei held it in his hand, looked at it a few times, and set it aside.

The main reason was that his expectations had been too high, so he was inevitably a little disappointed when he saw the unremarkable real thing.

Of course, it also had something to do with the fact that he is not a Kabakike. Otherwise he would certainly have run benchmarks, torn the card down, taken photos, and then posted the scoop online.

Sure enough, after putting down the AI card, he immediately lost interest and changed the subject: "This time the three horses are finally going to get together, right?"

"It should be," Atletico said indifferently. "What is there to look forward to? As the perennial third place, we are just going along to make up the numbers."