For AI to land, “hardening” is not easy

01

A surprising new product

To get their algorithms into real-world use, AI companies are racking their brains.

In August, SenseTime, a leading AI company in China, released a surprising new hardware product: the Yuanluobot robot. It is an AI chess-playing robot priced at 1,999 yuan. As the name suggests, it can only be used to play chess, but it imitates human movements, has AI algorithms built in, and offers different difficulty levels.

The design of the product looks rather like Xiaomi’s style. It has a small robotic arm and a white body about the size of two shoeboxes, with a screen on its forehead and a camera on its head. According to people familiar with the matter, SenseTime’s decision to make an AI chess-playing robot was indeed influenced by people in the company with a Xiaomi background. As for why it chose chess rather than Go, the same sources said the main reason is that the board and piece layout of Go are much more complicated than those of chess and are not easy to recognize.

Even so, Shen Hui, dean of SenseTime’s Institute of Science and Technology Innovation Engineering, revealed that the Yuanluobot robot took 20 months of research and development and went through nine product iterations.

However, many people in the industry said they could not understand this SenseTime product. Some think that although outsiders have always seen SenseTime as a large, all-round AI company, it is unexpected for it to make a toC consumer product directly. Others believe that even in toC hardware a company should look for potential hit products, and the chess-playing robot is a rather unpopular hardware category with little potential. There is only one other product of the same type on Tmall.

“I think SenseTime is selling at a loss.” In the opinion of one industry insider, SenseTime’s robot has a complicated structural design, and SenseTime has no advantage in consumer channels. Add in the various marketing costs, and it is difficult to make money. SenseTime even invited Guo Jingjing to endorse the product.

In his opinion, the color screen and the robotic arm are not cheap, even though the robot picks up pieces by magnetic attraction rather than by imitating a human hand’s grip, which would cost even more.

However, more of the doubts concern whether SenseTime needs to make such a product at all:

“A leading vision company makes this? Why not just download a QQ game!”

“Product managers should think more about people when making products. The old man goes out to play chess essentially in order to go out. Now he is kept at home by the robot, and his chance to get some fresh air is gone.”

Xu Li, CEO of SenseTime, explained the original intention behind the product: to bring AI capabilities built for large-scale industry into ordinary households, where people can experience and interact with them in real scenes. “Technology can give traditional culture a future.” As Xu Li tells it, his father was also a chess fan, but as he got older he had no chess partners other than those online, and playing online strained his eyes. With age he could no longer see the screen clearly, so he played very little.

It seems that the intention is to let visual AI products reach more ordinary people. But when iFLYTEK, a veteran of voice AI, made toC hardware, its goals were clear. The first was to find a market for hit products: Hu Yu, who is responsible for toC hardware at iFLYTEK, said that earphones were chosen because, a few years earlier, the category had already reached a market size of 100 million shipments. The second was to run through the whole industry chain: in choosing educational products, for example, it took into account the organization and supply of the content behind them.

However, the scenario SenseTime has chosen is not favored by many people. Zhang Binquan, an AI practitioner, told Data Intelligence Frontline that the robot can only play chess, so its function is too narrow. He once bought many robots and robot dogs for his children. They went through more than 20 functions in three days, and the novelty soon wore off. What consumers want is richness, playability, and fun: “a thing with a single function, after playing for a day, you will stop playing it.”

Some people think a better choice would be a desk lamp for “supervising children” and “interacting with children”, similar to ByteDance’s. Even a robot vacuum, which is a better fit for visual algorithms, is a market with stronger demand and far greater size.

An educational hardware practitioner told Data Intelligence Frontline that products with built-in AI capabilities, including learning machines, reading pens, and word cards, have sold very well in recent years. “Vocabulary cards are sold very cheaply, and the volume is also very large. A product can easily reach the order of 100,000.”

Although education and training companies have been hit by the “double reduction” policy over the past two years, one obvious trend is that hardware is among the few areas where online education companies are still investing heavily. Educational hardware is clearly something many parents need, since its ultimate goal is to improve student performance. By comparison, the demand for playing chess or Go is significantly smaller.

In addition, SenseTime has made educational programming cars before. Although that is a toB product on which SenseTime cooperates with schools, SenseTime could also take another route and promote toC educational hardware with visual functions, as DJI and Lego do. That, too, is an option.

One week after pre-sales of SenseTime’s Yuanluobot robot began, sales on Tmall had only just exceeded 100 orders, and it will be two months before the product ships. For a toC hardware product, such a long gap between release and delivery is not very reasonable.

ToC products are also very different from these AI companies’ toB business. The former tests supply chain capabilities and sales channels, and SenseTime has not established a complete channel for toC products. An industry insider told Data Intelligence Frontline that SenseTime may use toB as one of the channels for its Yuanluobot robot, for example by selling these products to chess academies, just as iFLYTEK sells translators to the government. In fact, SenseTime has also cooperated in depth with the Chinese Chess Association on this product.

02

Behind AI companies’ enthusiasm for hardware

Although few AI vision companies make toC hardware directly as SenseTime has, there is a consensus in the industry that it is difficult for pure technology companies to make money, and that technology must be combined with specific scenarios.

In recent years, making hardware, or integrating software and hardware, has been the direction in which China’s AI “little dragons” have pushed. They make boxes loaded with AI algorithms, or panel machines with cameras and facial recognition.

Among the AI vision companies, Megvii Technology was the first to move into hardware. In 2015, Megvii launched the first AI camera “with the mentality of trying it out,” said Yang Mu, co-founder of Megvii Technology. At that time, Megvii tested various kinds of hardware to find a carrier suitable for its algorithms. At first, the common view was that the carrier for algorithms was the server. But a server is expensive, consumes a lot of power, and requires users to build a server room. Meanwhile, access control in some residential areas and parks called for putting the algorithm on the camera at the edge.

In 2017, SenseTime also released its first self-developed smart hardware product, the SenseID ID verification machine, which is used for face recognition in airports and other scenarios. In March 2018, CloudWalk released an AI camera and began to integrate software and hardware. In May 2019, YITU launched its first chip, “Qiaosuo”, through an acquisition, and installed it in smart servers and smart edge computing devices.

Behind the major AI companies’ shift to integrated software and hardware is the fact that the early business model of relying solely on algorithms has become hard to sustain.

Liu Ruoshui, co-founder of Extreme Vision, told Data Intelligence Frontline that in the early stage the market was not short of money. A large amount of venture capital poured in, so AI companies paid little attention to cost control. Everyone hired people at high salaries to work on algorithms, in the belief that a good algorithm would find a market.

But at this stage, the outside world is more concerned with whether the algorithm can actually be deployed and whether the business model is feasible. Moreover, unlike software products that can be copied indefinitely, algorithms still require a great deal of customization, which makes it difficult to reduce costs.

Yao Zhiqiang, co-founder of CloudWalk Technology, told Data Intelligence Frontline that single-module, single-point technological innovation at the application level cannot meet the needs of increasingly complex niche scenarios, and AI companies urgently need to strengthen their ability to provide complete solutions.

Another reason is that with large customers that have technical capabilities of their own, algorithm companies can integrate at a lower level, but small and medium-sized companies often need something more complete. “If you give him an algorithm, it will be difficult for him to integrate it,” Yang Mu said.

“SDK is actually a very efficient cooperation model,” Yang Mu said. The premise is that the partners know what they need, as with face unlocking and face payment. Mobile phone manufacturers and payment companies have strong capabilities and clear requirements of their own, so algorithm companies only need to provide SDK interfaces.

But the problem is that in many intelligent scenarios, AI algorithm companies are often the first to see the opportunity, and they have to test the waters and cultivate the market themselves. In this context, the advantages of integrating software and hardware stand out.

Liu Ruoshui told Data Intelligence Frontline that most of Extreme Vision’s income comes from software algorithms, while hardware is procured by major customers themselves. “Each algorithm needs to find a particularly suitable piece of hardware to adapt to, and the return on that effort is not particularly high.”

However, small and medium-sized customers do not understand hardware at all, and their scenarios are relatively simple. Giving them products that integrate software and hardware allows rapid deployment and shipment. Examples such as “bright kitchen, bright stove” kitchen monitoring, smart construction sites, and smart safety supervision are all scenarios where visual algorithms are relatively mature. “The algorithm is relatively fixed, and the number of cameras that need to be recognized in a scene is also relatively fixed.”

“Those who only do AI algorithms or AI software, and do not go to the user side for data and applications, or do not go for chips and hardware, I think they will die in the long run,” Zhang Binquan said. In the early days, hardware accounted for a relatively low share of the cost and algorithms could be sold at a high price: one channel of face recognition could sell for tens of thousands.

But now one channel of AI face detection costs only a few tens of dollars, and the cloud-based price is under a few hundred dollars. In this situation, the share taken by hardware becomes extremely prominent.

From the perspective of capital, pure algorithm revenue cannot support a huge valuation. “I want to increase this revenue, because pure software income may not be enough.” An employee of an AI algorithm company said that although algorithms have a high gross margin, the overall market is not big, which does not help in telling a story to the capital market. “Purely selling algorithms, one or two billion a year is already a lot.” But one or two billion in revenue obviously cannot support AI companies’ market values in the tens of billions.

Moreover, as the barrier to entry for algorithms keeps falling, they no longer command high prices. An algorithm sells for 50,000 yuan, but with a server added it may sell for 150,000 yuan, which is equivalent to tripling the revenue.

03

It’s not easy to “harden”

But making software and making hardware require completely different capabilities.

“Algorithms need more high-end talent, algorithm engineers who understand business implementation, while hardware needs advantages in channels and supply chains,” Yao Zhiqiang said.

“Hardware is almost always about cutting costs, but when it comes to software, one powerful algorithm can outweigh thousands of troops.” An AI industry insider who often deals with the supply chain told Data Intelligence Frontline that there are many pitfalls in hardware: “The supply chain is a disaster area.”

She still remembers these lessons vividly. In the past, the bosses of Shenzhen factories would pat their chests and assure her that there would be absolutely no problem. Problems came anyway: for example, goods not delivered when due, corners cut, and so on. Even if the factory is held accountable afterwards, the missed sales opportunity is a heavy blow to the company.

She once met a customer who made a toy with a built-in 4G memory card and had purchased an IP license from a chip design company. What the customer did not expect was that the assembly factory would use second-hand materials, which caused frequent freezes. As a result, the units already sold generated a large volume of after-sales claims, and the unsold ones had to be returned to the factory to be redone.

Before 2018, Megvii had been looking for a more suitable hardware carrier. At that time, it bought a batch of vision-related hardware on the market and studied whether its algorithms could run on it. “The main work was in selecting various hardware and chips, and we used mature chips already on the market.”

But it still paid plenty of tuition along the way: many products were made but could not be sold. Moreover, the hardware development cycle is long and the management is full of detail, involving inventory, supply chain control, factory selection, and so on. “This is industry know-how that takes time to accumulate,” Yang Mu said.

For example, if you buy a batch of chips, how should they be stored, and will they oxidize in the warehouse? When defining a product, if even one interface or communication protocol is missing, the board may have to be redesigned.

“Whether the sensor is front-illuminated or back-illuminated, whether or not to add HDR, all of these are choices. If you choose wrong, it does not mean the product is useless, but that your pricing does not match market demand,” Yang Mu explained.

To verify feasibility in the early stage of the market, Megvii worked with an external team, and paid cooperation was relatively easy to arrange. With large-scale deployment, however, ease of integration, ease of delivery, and cost become more critical. “Once cost is a concern, hardware design becomes critical.”

Hardware design is the most critical factor in whether the whole product succeeds. After 2018, Megvii’s hardware R&D team gradually took shape. By then, the division of labor in hardware was already very clear: to do hardware, you do not need to start from making chips. What you need to do is focus on specific scenarios, taking into account the versatility of the product and the difficulty of delivery.

For example, in the access control scenario, connecting different numbers of camera channels requires different amounts of memory. In the past, servers such as Nvidia’s were often standard products and expensive. However, if the memory needed for a given number of algorithm channels is worked out at the start of hardware design, the final hardware product will have a strong cost advantage.
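The sizing idea in the access control example can be sketched roughly as follows. This is only an illustration, not Megvii’s method: the function names, the per-algorithm memory figure, the fixed overhead, and the list of memory configurations are all hypothetical values chosen for the example.

```python
def required_memory_mb(channels, algorithms_per_channel,
                       mb_per_algorithm, base_overhead_mb=512):
    """Estimate memory for a device serving `channels` camera streams.

    All inputs are illustrative assumptions: a fixed system overhead
    plus one algorithm instance per algorithm per channel.
    """
    return base_overhead_mb + channels * algorithms_per_channel * mb_per_algorithm


def smallest_fitting_config(needed_mb, available_configs_mb):
    """Return the smallest memory option that fits, or None if none does."""
    for config in sorted(available_configs_mb):
        if config >= needed_mb:
            return config
    return None


# Hypothetical access control box: 4 camera channels, 2 algorithms each,
# 300 MB per algorithm instance.
needed = required_memory_mb(channels=4, algorithms_per_channel=2,
                            mb_per_algorithm=300)
chosen = smallest_fitting_config(needed, [2048, 4096, 8192])
print(needed, chosen)
```

Sizing the board to the computed requirement, rather than buying a standard server with far more memory than the scenario needs, is where the cost advantage described above would come from.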

This is the “algorithm-defined hardware” concept advocated by Megvii: first understand the needs of the scenario, then design a hardware product with a price advantage according to the needs of the algorithm. For example, around 2015 it was generally believed that the best carrier for algorithms was the server. But a server is expensive, consumes a lot of power, and needs a server room, so it cannot be deployed everywhere. That is why Megvii made panel cameras and network cameras during that period.

Moreover, focusing on hardware manufacturing cost alone does not necessarily give the best result. If the hardware manufacturing budget is relaxed somewhat but software costs fall sharply as a result, the total cost will be lower.
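The trade-off can be made concrete with a small comparison. The figures below are invented purely for illustration and do not come from any company mentioned here; the point is only that the option with the cheapest hardware is not necessarily the cheapest overall.

```python
def total_cost(hardware_cost, software_cost):
    """Total per-unit cost is simply hardware plus software."""
    return hardware_cost + software_cost


def cheapest_option(options):
    """Return the name of the option with the lowest total cost.

    `options` maps a name to a (hardware_cost, software_cost) pair.
    """
    return min(options, key=lambda name: total_cost(*options[name]))


# Hypothetical per-unit costs in yuan.
options = {
    "minimal_hardware": (800, 1500),   # cheap board, heavy software adaptation
    "relaxed_hardware": (1000, 900),   # pricier board, less software work
}
best = cheapest_option(options)
print(best, total_cost(*options[best]))
```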

There is also the product development cycle. A senior person from Dahua told Data Intelligence Frontline that both Hikvision and Dahua compete on speed and launch products quickly. A product’s lead on key specifications can be held for only about six months.

Beyond these, hardware sales channels are also a big challenge for AI companies. For example, iFLYTEK has established a “CBG” channel system. Traditional hardware manufacturers such as Hikvision and Dahua have pushed their channels down into towns and rural areas. Megvii has also proposed cooperating with telecom operators. After all, operators have the largest sales and operation networks in China.

In Zhang Binquan’s view, deep learning, especially visual deep learning, has hit a bottleneck: the scenarios where it can be deployed have almost all been mined, and there are not many real commercialization opportunities. “Everyone thinks it is a new wave of artificial intelligence, but it has not made many waves.” In addition, hardware is becoming more and more homogeneous, which means competition will eventually come down to channels and prices.

SenseTime’s first attempt at toC hardware offers a new way of thinking, but so far it does not seem to have achieved much in sales or volume.

Yao Zhiqiang believes that artificial intelligence is currently better suited to the professional market, because it is not yet a fully general-purpose product. Consumer-grade products must be cost-effective, and since artificial intelligence is not yet fully general-purpose, it is difficult to bring the price down very far.

Now that the integration of software and hardware has become an industry trend, several leading AI vision companies have taken different paths. SenseTime is working on large models and large-scale computing infrastructure and is trying almost every line of business. Megvii focuses on AIoT and has chosen several key categories in consumer, city, and supply chain scenarios. CloudWalk leans toward a human-machine collaborative operating system; the hardware it develops itself is mainly meant to establish benchmark cases, with the aim of attracting partners. Whichever path they have chosen, all of them are still on the road.