Baichuan Intelligent's Wang Xiaochuan: 100 days into a large-model startup, I've confirmed I have found my "no man's land"
**Source:** FounderPark
In April, Wang Xiaochuan announced he was entering the arena and founded the large-model company Baichuan Intelligent.
In June, Baichuan Intelligent released Baichuan-7B, a 7-billion-parameter open-source large model.
In July, it released Baichuan-13B, a 13-billion-parameter open-source large model.
Both models have performed well across evaluations and have been embraced by a large number of users at home and abroad.
"Open source is very simple. Anyone can take it and use it for themselves; it is not something you can simply brag about."
He moves fast, faster than most people expected. Even he was surprised: once you actually start, progress comes much faster than anticipated.
At the AGI Playground conference, Wang Xiaochuan said that Baichuan Intelligent's open-source models will definitely surpass Llama 2 in capability. According to the plan, Baichuan Intelligent will also release closed-source large models at the tens-of-billions and hundreds-of-billions parameter scale.
A few months ago, Wang Xiaochuan's pledge to "be China's OpenAI" made headlines across the media.
It is a slogan the public loves to hear, but it may not be an accurate description.
What exactly does Wang Xiaochuan want to do, and how? After three months of practice and initial results, what first-hand lessons has he drawn about building a company in the era of large models?
Open source, what OpenAI didn't do
Zhang Peng:
What people may be most curious about is that you moved very quickly after starting the company, releasing two models, a 7B and a 13B, to very good feedback.
Everyone wondered: at first people thought you were going to be China's OpenAI, yet you released open-source models. Is open source a technical stage, or is it itself a new model in your thinking about the future?
Wang Xiaochuan:
We see open source as a technical stage. Being China's OpenAI does not mean copying its complete path. When we spoke with OpenAI in Silicon Valley, it was clear they have gone very far toward the ideal. For example, for GPT-4 they are now running computations on more than 20,000 cards; nothing at that scale has been seen in China.
They proudly say they are designing a computing architecture that connects 10 million GPUs together.
What does 10 million cards mean? Nvidia's annual output is about 1 million. Ten million is a moonshot-level plan.
On the other hand, when it comes to building applications and products, and even some broader technologies, OpenAI may have a shortcoming, or it may simply be something they are not particularly focused on right now.
So being "China's OpenAI" will look very different, because the ecosystem here differs from the one in the United States.
The recent release of the open-source Llama 2 caused a frenzy in China. Everyone felt it would change the landscape again, which is something OpenAI did not do. It is a pity this came not from Google but from Facebook (Meta), with Google still absent. In China, though, we saw this coming, and we believe open source and closed source will coexist in parallel.
Zhang Peng:
Open source and closed source will coexist in parallel.
Wang Xiaochuan:
Parallel, a bit like Apple's iOS and Android today. As more companies want in, relying on closed-source API calls alone will not be enough for them; closed-source services may ultimately cover some 80% of needs, while the remaining 20% will generate a great deal of open-source demand. There was no precedent for this model in the United States or in China, so Llama's release was a big shock in the US, and it set off a boom in China as well. In serious technical evaluations, including mainstream benchmarks such as SuperCLUE and comparisons run by major companies, Baichuan's quality in Chinese is clearly far superior to Llama's.
China's open-source model field will be like our mobile phone industry: at first everyone used American products, and then domestic players caught up.
Zhang Peng:
Llama 2 is very hot right now. Do you think Baichuan will do better?
Wang Xiaochuan:
There's Llama 1 and Llama 2.
First, we released our first model, the 7B, in June this year, and the 13B in July. Among English benchmarks, the most important is MMLU (Massive Multitask Language Understanding). On this key benchmark both of our models beat Llama (1), and in Chinese we are significantly ahead.
We know Llama's Chinese processing is actually insufficient. Against Llama (1), Baichuan's English partially matches and exceeds it on the key benchmarks, while our Chinese is clearly better. Many people are adapting Llama for Chinese, but it is still not as usable as the natively Chinese Baichuan.
After Llama 2's release we could also read its technical report, which describes roughly nine technical innovations; six of them were already in the model we are developing.
Among domestic models, our thinking is currently the closest to theirs. We had already done six of the points, had not anticipated two, and found one ineffective. So in comparison with Llama 2, we are not simply copying their technical thinking; we have our own, and we believe this road holds opportunities for us.
Today I also want to appeal to everyone in China not to assume that only foreign models are good. OpenAI is indeed still far ahead; reaching or approaching GPT-3.5 will take until around the end of the year. But on open-source models we are already very close.
Zhang Peng:
So for your next open-source model, do you think it will be better than Llama 2?
Wang Xiaochuan:
At least in Chinese. In the Chinese field it has already been surpassed. The next step is to give China a voice in the global open-source arena.
Zhang Peng:
Better than Llama 2 in both English and Chinese: that is visible and achievable to you.
Wang Xiaochuan:
I think there is a chance it happens in the foreseeable future.
Zhang Peng:
So your point of view is: today we cannot simply say that the future of large models is to follow OpenAI toward a closed-source, centralized path. Open source actually holds great possibilities. On one hand it is a way to practice and demonstrate technical capability, but it may indeed contain business models and value of its own.
And on the basis of open source, China producing the world's best Chinese model is still something people can look forward to.
Wang Xiaochuan:
That sums it up pretty accurately.
Is search experience a liability, or an asset?
Zhang Peng:
In the past, many investors believed that a team with a search background would never succeed at building large models. After these months of practice, have you verified your original judgment, which differed from theirs? How do the accumulation and capabilities of search contribute to large models?
Wang Xiaochuan:
Because today's AI achievement came from OpenAI and not from Google, investors' first thought is that this new technology is the very opposite of search. It is hard for them to tell whether the cause lies in technology or in organizational management.
This view arises, first, from not understanding the relationship between search technology and AI, and second, from the assumption that a search background carries negative baggage.
The main search companies, Baidu and Google, never needed external financing, so they never explained to investors what search really is. And the last wave of the AI boom was driven mostly by images, so people are unfamiliar with the NLP technology at the heart of search.
Look at the results: we released our first model in June. A competitor had told investors it would take Baichuan half a year to produce its first model. In fact we did it in a third of that time, then released a second. And soon we will release a closed-source model.
Baichuan has built everything in-house from day one, and the cold start was very, very fast. What is the reason behind that?
Today we know that high-quality data is the foundation of large models. So which companies have a broad understanding of language data? Search companies have been at it for 20 years, thinking every day about how to find high-quality data: for example, first finding the 100 best sites among a trillion web pages, then doing page analysis, including information extraction, deduplication, anti-spam, even extracting content at the paragraph level.
Sogou, Baidu, and Google have all been doing this kind of work for a long time.
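The pipeline described above (deduplication, anti-spam, page-level extraction) can be sketched in miniature. This is an illustrative toy, not any company's actual pipeline; the whitespace-normalization rule and the word-count "spam" threshold are stand-in assumptions:

```python
import hashlib

def normalize(text: str) -> str:
    # Collapse whitespace and lowercase so near-identical pages hash the same.
    return " ".join(text.lower().split())

def dedup(pages: list[str]) -> list[str]:
    # Keep the first copy of each page, dropping exact duplicates by content hash.
    seen, kept = set(), []
    for page in pages:
        digest = hashlib.sha256(normalize(page).encode("utf-8")).hexdigest()
        if digest not in seen:
            seen.add(digest)
            kept.append(page)
    return kept

def quality_filter(pages: list[str], min_words: int = 5) -> list[str]:
    # A stand-in anti-spam rule: drop very short pages.
    return [p for p in pages if len(p.split()) >= min_words]

corpus = [
    "Large models need high quality training data to work well.",
    "large models need   HIGH quality training data to work well.",  # duplicate
    "Buy now!!!",  # too short, treated as spam
]
clean = quality_filter(dedup(corpus))
print(len(clean))  # 1
```

A production pipeline would replace the exact-hash step with near-duplicate detection and the word-count rule with learned quality classifiers, but the stages compose in the same order.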
The second point is the talent pool: you need people with both algorithmic and engineering capabilities who are focused on search. Such people are basically all at search companies. ByteDance is now using its search team to build models, Baidu is also moving fast, and the model Shen Xiangyang is building is led by a former VP of Bing.
Another requirement for building a large model is evaluation. Judging whether a large model is good is genuinely a painful problem, spanning reasoning, precise question answering, creation, and more. Some things improve while others regress; how do you evaluate that? This evaluation system is also a capability search companies have accumulated over a long time, using evaluation to drive the iteration of the algorithms behind it.
In addition, a startup's organizational efficiency is much higher than a big company's. With a very flexible decision-making system, every efficiency is maximized.
Zhang Peng:
So did you go back and talk to the investors who thought search experience was no good for large models?
Wang Xiaochuan:
Their names have been crossed off the list; I no longer even remember who they were. Investors who only look at business and not technology, and who especially favor fresh-faced founders just back from the United States, are the ones I simply cross out and stop talking to.
Lao Wang was right: "small innovation depends on big companies, big innovation depends on startups"
Zhang Peng:
Do you think entrepreneurs will have enough opportunities in this wave of technological change, or will the giants remain the main players? How can entrepreneurs seize their own opportunities?
Wang Xiaochuan:
Although Wang Huiwen does not know much about technology, I think he said something very right: small innovation depends on big companies, but big innovation depends on startups.
A big company has many advantages in people, money, and resources, but as the organization grows it develops many internal problems, and its organizational efficiency is severely constrained.
If we firmly believe AGI is coming, then a huge explosion of new species will follow, and those are enormous opportunities for startups. History supports this deduction: as long as AGI arrives, there will be new opportunities.
Where is the difficulty in the middle?
OpenAI is a research-oriented company that lands its research in real-world products. If you follow it, you can achieve very dazzling research results. But when it comes to building applications today, neither OpenAI nor the technology-driven companies of Silicon Valley are very good at it. I am confident China is much better than the United States at application implementation.
The whole world has reached a turning point. Getting the technology in place is the first difficulty. Application and demand are the second, what we call model service. So the challenge now is: first, do you have the model? Second, does having a model mean you have a service?
Zhang Peng:
Is selling an API a service?
Wang Xiaochuan:
I don't think so.
It is like having self-driving technology: can you really build a car with it? Obviously not. It still requires the fusion of many technologies.
The United States is relatively confused at the application layer right now, while China's problem is insufficient model capability. Many startups building models also limit their perspective to the large model itself and do not know much about the rest of the technology stack.
The simplest example: when you build a model, you inevitably run into hallucination and staleness, and neither is easy to solve with the large model alone. Some try to curb hallucination by scaling parameters to the trillions or tens of trillions, or by using reinforcement learning. But the most direct way is to bring search and information retrieval into the loop; combining them with the large model forms a more complete technology stack.
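The "bring search into the loop" approach described here is what is now commonly called retrieval-augmented generation: fetch relevant text first, then ground the model's answer in it. A minimal sketch, with a toy word-overlap scorer standing in for a real search stack (the sample documents and prompt wording are invented for illustration):

```python
def retrieve(query: str, docs: list[str], k: int = 1) -> list[str]:
    # Score documents by shared words with the query; a toy stand-in
    # for a production retriever (inverted index or vector search).
    q = set(query.lower().split())
    scored = sorted(docs, key=lambda d: len(q & set(d.lower().split())), reverse=True)
    return scored[:k]

def build_prompt(query: str, docs: list[str]) -> str:
    # Grounding the answer in retrieved text is what curbs hallucination
    # and staleness, since the context can be fresher than the model.
    context = "\n".join(retrieve(query, docs))
    return f"Answer using only this context:\n{context}\n\nQuestion: {query}"

docs = [
    "Baichuan-13B was released in July 2023.",
    "The inverted index maps terms to documents.",
]
prompt = build_prompt("When was Baichuan-13B released?", docs)
```

The resulting prompt carries the dated fact inside the context, so the model can answer a question its training data may be too old to cover.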
Since this idea was proposed there have already been signs of it. For example, the vector database, which is essentially a flexible form of search and is mainly used in toB scenarios.
In search, after Transformer technology arrived in 2018, we already had semantic search capability. You may have heard of the inverted index, which indexes the web by symbols, that is, by keywords.
After 2018, whether at Sogou, Baidu, or ByteDance, search shifted to vector-based semantic retrieval, and behind the three of us sit huge vector databases. Combining this technology stack with the large model can push the model further. So you can see that search-team experience is an advantage in building models.
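The shift from inverted-index keyword matching to vector-based semantic search comes down to scoring documents by embedding similarity rather than by shared terms. A toy sketch, with hand-written 3-d vectors standing in for a learned encoder and a real vector database:

```python
import math

def cosine(a: list[float], b: list[float]) -> float:
    # Cosine similarity: the standard scoring function in vector search.
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(x * x for x in b)))

# Toy "embeddings"; a real system would compute these with a trained encoder.
index = {
    "how to train a large model": [0.9, 0.1, 0.0],
    "best noodle restaurants":    [0.0, 0.2, 0.9],
}
query_vec = [0.8, 0.2, 0.1]  # pretend embedding for the query "LLM training tips"

best = max(index, key=lambda doc: cosine(query_vec, index[doc]))
print(best)  # how to train a large model
```

Note that a pure keyword index would return nothing here, since "LLM training tips" shares no literal term with either indexed title; the vector comparison still surfaces the semantically closest document, which is the point of the post-2018 shift.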
The second aspect is that large-model technology is gradually becoming practical. In so-called knowledge computing, vector databases and search need to be added to form more complete technologies and products. On this point a consensus is gradually forming.
Looking at ChatGPT's traffic today, people have begun to worry about whether the explosion can continue.
So we still need more exploration.
We believe role-playing has broad prospects in entertainment, but doing it well will require Chinese companies to enter.
Another direction is combining large models with search. Perplexity AI is doing very well now, but we are in a passive position: when an opportunity appears in the United States, investors go looking for the comparable Chinese company.
And if that company, first, has no large model of its own and only calls APIs, and second, has no search technology and can only lean on Google's or Bing's, the result is not ideal.
Zhang Peng:
You just said user numbers at services like ChatGPT are declining, which makes people feel the new paradigm may not break through all at once. Is that a great challenge for entrepreneurs building applications?
Because, as you described, in an environment of immature technology the cost of entrepreneurial exploration is very high. And if an entrepreneur merely wraps someone else's API, the shift in application paradigm is not particularly compelling.
Wang Xiaochuan:
Two days ago OpenAI upgraded the Code Interpreter, and then rolled out custom instructions. For startups that creates tremendous pressure.
American investors are likewise anxious about whether startups still have a chance to surpass the giants, or whether they will be displaced by big companies halfway through their work.
In China, I don't think there is yet a clearly leading large-model company like OpenAI; we are still in the "war of a hundred models" stage. And whether the companies building large models can also build applications is an area where China is far more worth watching than the United States.
Catch up with GPT-4? Blindly chasing generational model improvements is dangerous
Zhang Peng:
This raises the question of who in China will catch up to GPT-3.5, or even GPT-4.
There is also another voice saying GPT-3 is already enough for enterprises solving problems in vertical scenarios.
My sense is that you, Xiaochuan, are committed to chasing GPT-3.5 and GPT-4. Is that chase difficult? Why do you say you must reach GPT-4's level?
Wang Xiaochuan:
I think it's two things.
The first is that intergenerational advances in technology can have a crushing effect on the ecosystem of later products. Think of third-, fourth-, and fifth-generation fighter jets, where each generation can play a decisive role. So everyone should strive for an advantage in this highly competitive field.
But in the race for advantage, a new confusion appears: at which generation do super applications actually become possible? GPT-3.5 has not yet produced a super application in the United States. Training it once costs about 50 million RMB, not counting preparation and experiments. A GPT-4 training run may cost 500 million RMB, and by the GPT-4.5 generation the cost could reach 500 million dollars.
So without a super application, simply chasing technical improvement is very dangerous. We need to pursue fourth- and fifth-generation technical capability and super applications at the same time; otherwise we could suddenly face pressure on two fronts, both of which must be upgraded together to succeed.
Zhang Peng:
So every generation of the technology should be able to produce valuable applications.
Wang Xiaochuan:
What you just said is quite right.
GPT-3 is basically usable on the B side; I think the reason it is not yet usable on the C side is simply that too little time has passed.
Besides, everyone focuses too much on OpenAI. It is not a product company, nor one that builds super applications.
Building a super application requires not just catching up technically but also a sufficient understanding of product. I think by the end of the year the truth will come to light.
"Xiaochuan is suited to large models" "In 20 years of doing search, no one said I was suited to search"
Zhang Peng:
Is it possible that everyone overestimates OpenAI? In other words, we think OpenAI is hard to surpass because it has a data flywheel. How do you understand this? Does the data flywheel really exist?
Wang Xiaochuan:
At the beginning of this year I talked about the data flywheel too, and at the time I was quite panicked: users' requests go to it, it learns what users want, and it can then upgrade the model better.
So far, that problem has not proven serious.
For example, after Llama 2 launched, everyone saw that in the fine-tuning stage the data should be refined and small in volume rather than large. A consensus is gradually forming that the real secret is not the flywheel but the accumulation of technology over time.
Anthropic, the company behind Claude, has also risen rapidly in technical strength even though OpenAI has far more users, which likewise falsifies the data-flywheel idea.
Zhang Peng:
Or perhaps the truly valuable data shows up in conversational skill. I remember in the earliest days its way of talking was rather "dumb", but now it feels much more human.
Wang Xiaochuan:
That part does not amount to much; more of it lies in the data set. Whether in pre-training or fine-tuning, how do you obtain high-quality data, and how do you mix it? That is their core capability. In particular, I have heard them chat about how GPT-4 is being used to train GPT-3.5 so that 3.5 runs more efficiently, while GPT-4 also produces part of the data GPT-5 will need, optimizing the iteration process. Internally, each iteration improves the online service while generating data for the future. That, I think, is the real internal flywheel.
Zhang Peng:
So seen from Llama's perspective, open source can also deliver a model whose technical level keeps improving. But from OpenAI's perspective, at some stage it may simply have enough users and data.
Wang Xiaochuan:
Open source, closed source, and applications: in fact everyone is still doing all of it today. We are still in an expansion phase, like the American Old West.
Zhang Peng:
So for a startup like Baichuan, it is not easy to commit to just one direction. Whether you call that keeping the strategy vague, or call it being rich in possibilities, you may bet on all of these dimensions.
Wang Xiaochuan:
Right. This venture of mine is quite interesting. Many people say Xiaochuan is especially suited to building large models. I did search for 20 years, and no one ever said Xiaochuan was especially suited to doing search.
We started doing search in China 3 years behind Baidu, and that kind of catch-up is very difficult. Today the same accumulation and experience apply, but there is no one ahead of us. Back then it was hard even to contemplate, being years late; today it seems to me there are opportunities everywhere. So as long as we are capable enough, we can compete on every front, and perhaps change your old impression of Sogou.
Don't worship blindly, China and the United States will have different AI stories
Zhang Peng:
I am quite moved by what Xiaochuan said. You are finally standing in a no man's land.
On large models, many people feel they need to learn from and catch up with OpenAI. But only when you actually do the work do you truly discover the distance and the path.
Wang Xiaochuan:
Yes, don't be so worshipful.
I remember that after AlphaGo in 2016, I made two points. First, if AI could predict the next frame of video, AGI would have arrived.
But once you have said it, that is the end of it; you lack the motivation, the ability, and the conditions to act on it. Later I said that if machines mastered language, strong artificial intelligence would come too. Now that is actually starting to be verified.
So I think we have plenty of ideas of our own, and we are not behind. It is just that the timing and conditions were not ripe. It is like a top student telling you a problem is solvable; that does not mean you copy his homework, right?
Others can tell you it is solvable, and even hand you the general idea. I think we can do it ourselves; we do not need to stare at other people's homework and copy it.
Zhang Peng:
So the real fun for you is not realizing and reproducing what others have done, but exploring things in this no man's land that no one has grasped yet.
Wang Xiaochuan:
Yes, I think this time I have the opportunity to lead in some areas.
Zhang Peng:
There is that possibility. So China and the United States, Baichuan and OpenAI: maybe it will not be the same story.
Wang Xiaochuan:
It will indeed be different. China and the United States do not share the same system or culture, so what grows out of each, whether in technology or in applications, will be different.
His main job: chatting with colleagues; Baichuan Intelligent just passed 100 people
Zhang Peng:
How do you arrange your work and allocate your time? Many people say computing power matters and talent matters, but I think only those actually building a company know what matters most. So where do you spend the most time?
Wang Xiaochuan:
Most of my time now is spent chatting with our colleagues.
Zhang Peng:
Chatting?
Wang Xiaochuan:
Yes. The process of chatting is really a process of continuously forming consensus: pooling everyone's cognition, learning, and outside information so that everyone forms the same brain.
Because we know that pure Top-Down can go astray. Why, for example, didn't Google make it? Google headquarters had its inertia: the data could not be obtained, costs would rise once a product went online, and there were no clear user benefits, a classic innovator's dilemma.
Google Brain, by contrast, is Bottom-Up. Its researchers are very free to do what they want, or to join forces, so many of them had actually glimpsed the large model but could not concentrate on it because their forces were scattered. DeepMind is Top-Down: the company decides what to do and everyone does it. It has produced AlphaGo, AlphaZero, and AlphaFold, and is now moving toward hydrogen energy, nuclear fusion, and quantum computing, but those remain far away from most people.
What Top-Down plus Bottom-Up achieves today is what we call "shared aspiration from top to bottom": everyone fully aligned from ideal to technology to cognition, fused into one thing. So I think daily communication that turns everyone into one brain is my most important job.
Zhang Peng:
Hmm, interesting. So for a small team to exert its greatest energy, everyone must share the same joys and sorrows, the same aspirations.
Wang Xiaochuan:
This is very important. We won't go into organization and management today, but we do become like one person. Baichuan now has 100 people; we reached 100 just yesterday.
Changes in confidence, Baichuan's progress is faster than expected
Zhang Peng:
Over the past few months you have been immersed in large models. Your enthusiasm has not changed, but has your confidence? Has it been harder than you expected at the start, or in line with your expectations?
Wang Xiaochuan:
Honestly, from the bottom of my heart, I think it has been easier than I expected.
Going in, you know about all the difficulties: years of accumulation abroad, computing power, servers... But once you work alongside your colleagues and everyone is co-creating, our actual progress and pace have been faster than expected.
We originally planned to release a 50-billion-parameter model in Q3 and then a 175-billion-parameter model in Q4.
Those plans have not changed, but along the way the pace of application progress and of the open-source models has been much faster than expected.
Faster still: today we can begin to say that we aim not only to be the best at home, but to open source on the international stage.
Open source is very simple. Anyone can take it and use it for themselves; it is not something you can simply brag about. Once it is out, we are confident we can earn a very good position on the international stage.
Zhang Peng:
So before jumping into the water you were anxious about not knowing the depth, but once you actually jumped in you found you could touch the bottom, and felt much more at ease? Is it really that simple?
Wang Xiaochuan:
It depends on the person.
I am a relatively cautious person. I was still watching when our co-founder kicked me in and said, start. So I said okay, announced I was in, and got to work. Otherwise it might have been even later before I felt ready. But once you are on the field, you find you run faster than you thought.
After the frenzy: recent technical developments worth watching
Zhang Peng:
Have you been following recent technical progress on large models? Which papers excite you?
Wang Xiaochuan:
First, merely reading papers is actually not that important today. You cannot keep up with them all.
The fundamentals are still those same things. And OpenAI has stopped publishing good papers; what it does publish carries little information, so the reward is limited.
At the same time, everyone had entered a kind of frenzy before. We called it "living a year in a day": every day felt like a year's worth of technological progress.
Zhang Peng:
Days feel like years because everything moves so fast.
Wang Xiaochuan:
Yes, never boring. There are so many new things every day that everyone's nerves were stimulated to the peak and went a little numb.
That said, there have been several recent technical advances that I think are very powerful.
One is that about a week ago OpenAI launched the Code Interpreter, a major breakthrough, yet there seems to have been no new round of media frenzy in China.
Everyone enjoyed the earlier frenzy, but this advance, the Code Interpreter, I don't think the media paid enough attention to or covered enough.
And there was a small upgrade yesterday: customizing your own instructions.
It marks the start of the move from the LLM as a bare model toward Agents.
It lets you describe "who am I, and what are my characteristics": what role your large model will play and what traits it has. Forming that kind of relationship is, from one angle, what makes the model an Agent.
Neither of these two areas has received enough attention or coverage today.
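The model-to-Agent step described here amounts to attaching a persistent "who am I" description that frames every later turn. A minimal sketch of the widely used system/user chat-message format; the travel-planner role is an invented example, and no API call is actually made:

```python
def make_agent(role_description: str):
    # The system message is the standing custom instruction; it persists
    # across the whole conversation, unlike each one-off user message.
    history = [{"role": "system", "content": role_description}]

    def ask(user_text: str) -> list[dict]:
        # In a real product this message list would be sent to a
        # chat-completion API; here we only show how the persistent
        # role frames every request.
        history.append({"role": "user", "content": user_text})
        return history

    return ask

travel_agent = make_agent("You are a travel planner; always reply with an itinerary.")
messages = travel_agent("Plan two days in Chengdu.")
```

Because the role rides along with every turn, the same underlying model behaves like a consistent character rather than a blank question-answerer, which is the relationship the passage calls an Agent.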
The decision to start: finally, a no man's land "suited to Xiaochuan"
Zhang Peng:
The last question. You just mentioned that you were "kicked" into this venture, and I know you have been an AI fanatic since the days of AlphaGo.
In the end you made up your mind to become an entrepreneur in AGI and large models. How did Lao Wang (Huiwen), and this wave in China, influence your decision? And having gone through the process, how has your inner state changed?
Wang Xiaochuan:
The mental journey is actually quite long.
In Sogou's later period, first, we missed the recommendation engine and formed a strategic alliance with Tencent. Without new technological breakthroughs, our room to develop was very limited. When Sogou was merged into Tencent, I was taking on something more interesting: turning life into a mathematical model, the way we say Newton turned physics into mathematical models.
On Geek Park's stage before, I talked about learning from life.
What is life? That is something I have been thinking about for 20 years.
How do you turn life into a mathematical model? That is what I care about. I even looked into Chinese medicine for it, and later found that road might not be smooth.
I am particularly interested in how medicine can break new ground within the scientific paradigm. I have read far more medical papers than computer-science papers; I have read thousands of them.
What happened in 2021? In 2021, large models began to present opportunities. At that time we built a tens-of-billions-parameter model to turn search into question answering.
In fact, back when we built the input method I was already working on "predicting the next word you want to say," and then on rewriting queries so that search became question answering. We had touched the door, but the technology had not broken through yet.
So, knowing my interest in turning life into a mathematical model, when large models arrived my first thought was not to build one. My question was: can we build a Health ChatGPT in the field of life today, a HealthGPT, a digital doctor?
Zhang Peng:
You are thinking from an applied, problem-solving perspective.
Wang Xiaochuan:
Yes, that was my thinking. Then I realized that if you build a vertical model today, it might be killed by the general model. General intelligence crushes specialized intelligence, right?
And in that case we found it is not enough to build only a HealthGPT, only a digital doctor.
In the end you still have to build the general large model.
(The decision to build a general large model) came from going around that circle, not from thinking my past accumulation qualified me for it.
But once we started building it, we found the previous accumulation highly relevant, for example in language processing.
Taken to the extreme, ChatGPT is the third super application built on language models. The first two were search and the input method.
Zhang Peng:
It feels as if, had you not done this one too, it would have been a disservice to the two you did before.
Wang Xiaochuan:
Yes. So I found the previous accumulation genuinely useful today, which I had not expected.
So I am quite moved: God has been very kind and given me a chance. At the end of search there is another chance to use all that experience to do something that could not be done before.
Now no one says "Xiaochuan is suited to search," but everyone says "Baichuan is suited to large models." For me, that is a very lucky thing.
Zhang Peng:
That is why you decided to do it in the first place.
A few months in now, people may find this difficult. OpenAI has not yet become a hugely profitable company, and many in Silicon Valley question its business model. So large models will put this kind of pressure on entrepreneurs. Have you felt it?
Wang Xiaochuan:
For me it is pure excitement.
I used to work in Baidu's shadow, but now it is no man's land. This is exactly what I want to do, rather than having a leader in front and following along. For me, this is what I like: a new exploration.
Zhang Peng:
Special thanks to Xiaochuan for sharing with us today, and congratulations on finally reaching your no man's land. I hope to see more beautiful scenery here. A round of applause for Xiaochuan!