Rendered at 05:41:55 GMT+0000 (Coordinated Universal Time) with Cloudflare Workers.
Nifty3929 14 hours ago [-]
China may be subsidizing this for now in a way that US companies can't or won't - but if they keep building power infrastructure and the US doesn't, then it will no longer require subsidy from them. It will simply be absolutely cheaper (including profit margin) to serve tokens in China.
China is building for the future, while Western Democracies are afraid of the future, and of their own shadow.
hedora 13 hours ago [-]
I'm not sure how much of it is subsidies. If the open weight models are anything to judge by, China is taking price performance seriously, and the US model vendors are looking for performance at any cost. Like any other Pareto optimization, we end up paying 10x more for the last few percent improvement on benchmark scores.
Of course, like literally every other time this has played out in computing history, the companies focused on price performance will end up with more economic resources, and get to turn the upgrade crank more often and for longer.
Also, of course, China's way ahead of the US on things like renewables, batteries, and electrification of their economy. All of that feeds into cheaper power to run the models, but I suspect it's a second order effect vs. "improve the software".
submain 8 hours ago [-]
It seems to me China is chasing widespread adoption, while the US is chasing the AGI dream.
layoric 8 hours ago [-]
They also banned crypto mining which previously was using the free to cheap electricity, so if AI data centres are using those now under utilised supply, very possible subsidies are very low.
aftbit 6 hours ago [-]
And yet despite the ban, China's contributions to Bitcoin mining remain very large.
not just renewables, also massive nuclear capacity and huge modern coal plants. They can really crank up capacity if they want to. How long will it take to get a new nuclear power plant operational in the US?
weakened_malloc 2 hours ago [-]
Someone (can't remember who) said it best. US is the best at going from 0 to 1, China is the best at going from 1 to 100.
dluan 35 minutes ago [-]
This has been a meme in Chinese tech/startup world lately, as it's now the main problem they are trying to solve. They largely consider 1 to 100 solved, and have set their sights on the new goal.
radialstub 10 hours ago [-]
> Of course, like literally every other time this has played out in computing history, the companies focused on price performance will end up with more economic resources, and get to turn the upgrade crank more often and for longer.
The iphone is the best selling computing device in history and is among the most expensive in its category.
palata 9 hours ago [-]
Most smartphones being sold are Android, though.
radialstub 9 hours ago [-]
True, however apple makes the overwhelming majority of the profit in the smartphone market.
dietr1ch 8 hours ago [-]
For most people Apple's main selling point is about showing off the cute devices and battery life, but that's not going to play a role when users are free to choose the tool that will call the models.
swingboy 7 hours ago [-]
You might be vastly overestimating what a majority of phone owners use their phone for.
rootusrootus 7 hours ago [-]
I’ve never seen anyone show off an iPhone. What a weird take.
dietr1ch 6 hours ago [-]
I was talking more about laptops, but haven't you seen people sms bubble colour-shaming?
1over137 3 hours ago [-]
Really? What country are you in?
9 hours ago [-]
onlyrealcuzzo 13 hours ago [-]
> China may be subsidizing this for now in a way that US companies can't or won't
They're subsidizing this in many ways - Huawei chips, new DDR5 memory fabs, etc.
Ultimately, DeepSeek's architecture is significantly more cost effective than anything from Google, OpenAI, or Anthropic.
Presumably, they'll incorporate DeepSeek's MLA* architecture to get all the benefits for next year's releases (if not this year's upcoming releases) which will bring down their costs...
They need to actually make money, though, so that might still not give them enough room to make enough money.
Ultimately, hardware depreciation is like 80% of total spending. So power is not as big of a deal in cost. The bigger problem is if you can get the power at all, not how expensive it is.
If you want to bring down inference costs, using less hardware is far more effective than getting cheaper electricity.
Google is in a sweet spot, because they aren't paying 80% margins to nVidia for hardware. So they're probably paying half as much deprecation as everyone else is (or maybe 1/4th for inference - which is now the biggest percentage overall).
nl 6 hours ago [-]
> Huawei chips, new DDR5
The US is subsidizing in exactly the same way through the US Chip Act (as well as state level tax subsidies):
> The act includes $39 billion in subsidies for chip manufacturing on U.S. soil along with 25% investment tax credits for costs of manufacturing equipment, and $13 billion for semiconductor research and workforce training
> Presumably, they'll incorporate DeepSeek's MLA* architecture to get all the benefits for next year's releases (if not this year's upcoming releases) which will bring down their costs.
You can be sure the frontier labs all have similar approaches, but they just don't talk about them. That's why eg Google Flash (the old versions!) were do cheap.
I mean Google published MTP a month or so ago and it has sped up Qwen models by 1.7 times.
If that is what they still publish you get an idea of what they aren't.
onionisafruit 13 hours ago [-]
What’s the TLA architecture? I haven’t read about that.
Look up the US deficit- they have been subsidizing everything since the 1980s.
toddmorey 13 hours ago [-]
It feels like the US for years has operated under the assumption that homeostasis for the global economy would always be “designed in California, assembled in China.”
Like there was something in the American DNA that was lacking in China and innovation would always need to happen here.
But China it seems doesn’t need the US to produce great cars, devices, robotics, or AI. We absolutely need China to help us build all of the above.
HerbManic 6 hours ago [-]
The one major area they are still behind in is CPU tech, but they are hungry and thus moving quick.
Looking at Loongsons processors for instance. About 15 years ago they coudl barely compete with a Pentium 2. Now they are about 4-5 years behind Intel/AMD. Further behind on some more specific work loads (SSL decoding for example) Not great but that is a decent jump. The jumps between generations are pretty decent.
LA446 was a decent enough processor core but had an awful memory controller that held it back as soon as it needed to reach outside of cache. As such it was SLOW.
But they learned the lesson and now the LA664 almost entirely fixed that issue. I think a big part of performance issues is that they are working domestic 5 to 7nm processes, so a good 5-7 years behind.
They are launching the LA864 later this year and are touting some decent performance gains. That is just marketing so far but something to keep an eye on.
Considering that these chips are using their own ISA, own designs, domestic manufacturing and they aren't terrible is a big thing.
I suspect in the next 5 years they have the chance of completely closing the gap. But it can also go the other way that they end up stalling as smaller nodes get much more difficult to attain.
TedDoesntTalk 5 hours ago [-]
How much does corporate espionage help them?
HerbManic 38 minutes ago [-]
Who knows but any 'answer' anyone could give is pure speculation.
You could be right! But I do see this claim come up every time Chinese tech comes up. It might be a valid concern but it might also just be folks attempts to try and undermine the technology gains of the nation.
The ISA they have developed with based off years of with with MIPS and RISC V, so it isn't entirely new but they are definitely pushing it forwards. I have no idea if any of their developments could be back ported down the RISC V.
shimman 4 hours ago [-]
Probably the same amount it helped the US in the late 1800s/1900s, a substantial amount.
fridder 13 hours ago [-]
Might be more far to say: they needed the US until they caught up. The massive straight up IP theft helps a lot here. Though theft might be too strong since a lot of companies knew what they were getting in to
palata 9 hours ago [-]
> The massive straight up IP theft helps a lot here
I think this is vastly underestimating what "catching up" means. All my life, people have been saying "China copies". Now they are objectively better at many things (including robotics), and... well it seems that we cannot "just copy".
I saw western companies trying to "copy" superior Chinese technology, talking to brilliant engineers explaining how much they were learning by actually trying to copy.
The lesson I got from that is that China did not "copy"; they learned. And it took time, and now they are better. Now the western world has to learn from them, I guess.
jubilanti 8 hours ago [-]
Growing up moving around both conservative and liberal parts of the US, from middle school to college, I distinctly remember several US history classes where I was taught the exact same narrative about Samuel Slater. About how he was an American hero and the Father of the American Industrial Revolution because he memorized a bunch of industrial patent blueprints and brought them over to the US.
It got told as: the evil English made it illegal to even import blueprints for factory machinery, to keep the colonies in resource-extractive poverty, so they'd have to send raw materials overseas to get processed, then import the finished goods. (My other history teacher, the Anno / Dawn of Discovery video game series, also cemented this bit about resource extraction in my head at a young age.) But then thanks to heroic ingenuity and cunning, I was told, the US was able to outwit the colonizers and process its own raw materials, eventually gaining full economic, military, and political supremacy.
Sounds familiar.
aucisson_masque 8 hours ago [-]
It's ip theft when the Chinese do it but when it's the American copying on Chinese it's called learning.
whatshisface 9 hours ago [-]
Producing great products is a game at which every player wins, because sellers must find willing buyers. It only fails if one participant panics and jumps out of the window, or if a significant number of people are not participating (this is always the case when wealth inequality is involved).
Projectiboga 9 hours ago [-]
China is out producing us at new scientists and engineers.
windexh8er 1 hours ago [-]
> The lesson I got from that is that China did not "copy"; they learned. And it took time, and now they are better. Now the western world has to learn from them, I guess.
And Apple played a huge role in teaching them. We should all thank Tim Cook and team for almost single handedly bootstrapping China 2.0, the China that runs circles around the west in terms of production and development.
Peter Zeihann really got it wrong in his latter books.
toddmorey 11 hours ago [-]
Ok, not my favorite narrative, but assume asymmetric application of intellectual property rights was a big factor. Wouldn't the US exploiting asymmetric labor wages, rights, and conditions be the even bigger story? It still feels like a short-sighted own goal. The US abandoned its ability to manufacture. Maybe dark factories and robotics can bring it back, but manufacturing supply chains are just so much more advanced in Asia than in the US.
smallmancontrov 7 hours ago [-]
> Wouldn't the US exploiting asymmetric labor wages, rights, and conditions be the even bigger story?
Yes, but "the US" is reductive. The exploitation wasn't done by the towns having their tentpole industries shipped overseas, it was done by the people shipping them overseas and pocketing the profit. US capital owners made a deal with the Chinese Communist Party that was good for both of them and bad for the US.
tmnvix 35 minutes ago [-]
And good for the people of China presumably.
airstrike 6 hours ago [-]
That's really well said.
The promise was always to get cheaper goods and services in the US, so long as the Chinese firms never competed. Guess what, they compete now.
xbmcuser 8 hours ago [-]
Lol it was not ip theft it was American and European companies building factories in China themselves teaching them how to manufacture use their cheap labour. Well they learned and as they were the dong the manufacturing got better at it. I believe the current aerospace industry which the US leads in is also result of IP theft from the British then out innovating them.
defrost 17 minutes ago [-]
> I believe the current aerospace industry which the US leads in is also result of IP theft from the British then out innovating them.
Jet engines, proximity fuzes, radar, how to make a nuclear weapon, etc. are all examples of British / Commonwealth technology "gifted" or "traded" to the USofA during the WWII years in exchange for production.
So, not IP theft .. but absolutely foreign ideas taken in by the US and built upon.
swasheck 9 hours ago [-]
IP theft may only be part of the story though. it’s a question of priorities. US optimizes for profit which can place limits reinvestment. China seems to optimize for ubiquity and dominance, and has the capital to throw at those goals. when you’re beholden to the shareholder/ceo/investor, you make concessions to stay within their will. when you’re beholden to the state, you do the same.
gmerc 8 hours ago [-]
Talking about IP theft with a straight face in context of AI.
lol. Not that kind of IP theft, that doesn’t count.
Wait until you hear about the history of US industrialization. This trope of 'they stole our ideas' needs to fade away, it's a coping mechanism based on the assumption of inherent superiority of American society rather than the natural wax and wane of civilizations due to varying structural factors.
lejalv 10 hours ago [-]
This so much. You can also read up about when Germany sent industrial spies to Great Britain. And the first documented case of industrial spionage was against... China.
It plays this way: you're behind, you ignore IP rules. You're ahead: you create them to defend your newly-gained status.
Also please no moralizing here on IP when the entire OpenAI/Anthropic playbook has been "massive straight up IP theft". The irony.
dangus 11 hours ago [-]
At some point we can’t keep blaming IP theft for obvious innovation and investments being made by China.
We also can’t blame subsidy. All countries subsidize their industries.
This video on the auto industry covers a different industry but has a lot of the same rhymes as far as China’s strategy:
1. Treats low margin industries like mining and utilities as areas to focus investment and come up with incremental improvements, making those available to all companies. The West, by contrast, allows private companies to handle those industries, who logically don’t bother investing in them since their investors consider those basic industries to be low-value segments of the production chain. But now we see those advantages in China where investments have been made (e.g., the best battery chemistries and mining/refining, the cheapest power (when was the last time your local utility company focused on reducing pricing?)).
2. Because all companies in China have access to the same excellent infrastructure, they must compete furiously on quality/features/price of their products.
3. China allows foreign competition so long as they operate in China (see: Tesla) further insisting that their domestic products be globally competitive and that foreign products sold in their country benefit their local ecosystem.
api 11 hours ago [-]
The US committed massive IP theft in the 19th century when we industrialized.
ceejayoz 11 hours ago [-]
As did the big AI providers.
falcor84 10 hours ago [-]
I would appreciate some reading pointers about this.
nl 6 hours ago [-]
> Samuel Slater ... known as the "Father of the American Industrial Revolution", a phrase coined by Andrew Jackson, and the "Father of the American Factory System". In the United Kingdom, he was called "Slater the Traitor" and "Sam the Slate" because he brought British textile technology to the United States, modifying it for American use.
> He learned of the American interest in developing similar machines, and he was also aware of British law against exporting the designs. He memorized as much as he could, and departed for New York City in 1789. Some people of Belper called him "Slater the Traitor", as they considered his move a betrayal of the town where many earned their living at Strutt's mills
What? How is someone born in 1947 relevant to ip theft in the 19th century?
imjonse 10 hours ago [-]
'usa ip theft 19th century' in your fav search engine
falcor84 6 hours ago [-]
Well, I did, and to save others the time, the most relevant resource I found appears to be the book "Smuggler Nation: How Illicit Trade Made America” (2013) by Peter Andreas
gedy 7 hours ago [-]
Sure but I think what people are actually concerned with today is China copying a product and dumping cheaply back in the country it was taken from. That scale and speed is not what was happening in the 19th century.
I personally have little issue with countries doing that for domestic use (I hate using term "IP theft"), but to re-export so quickly you can't run a viable business in your own country is not fine.
FpUser 8 hours ago [-]
>"IP theft"
Can we stop this crying baby already. Every country has stolen from the other. Did you really expect countries to settle on sewing closes and ship all profits to foreign companies for eternity? The IP is just an artificial concept that participants follow for so long as it benefits all parties.
encrypted_bird 9 hours ago [-]
> Like there was something in the American DNA that was lacking in China
In most Americans' eyes, unfortunately, there was. It was just known by the name "American Exceptionalism". Yes, it's nonsense, but unfortunately it is nonsense that has historically been used by most empires throughout history, and believed just as fervently by said empires' populi since it's one of the central elements of imperialism as a whole.
TurdF3rguson 8 hours ago [-]
The US models are still better though, let's not get carried away. Ours are better, theirs are cheaper. That's how it's always been.
roncesvalles 10 hours ago [-]
>Like there was something in the American DNA that was lacking in China and innovation would always need to happen here.
There is (was): attracting the best minds around the world to a free and stable society. Trump voters threw it all away because they couldn't stand non-whites coming to America and doing better than old stock Americans.
DaSHacka 8 hours ago [-]
> attracting the best minds around the world to a free and stable society.
China is comprised of ~91.5% ethnically Chinese citizens. [0]
> Tump voters threw it all away because they couldn't stand non-whites coming to America and doing better than old stock Americans.
The U.S. is more diverse than it's ever been [1], and under Trump we're still below the deportations of Obama's terms.
Sounds like open-borders immigration was never necessary in the first place, given that we're being beat by a country with a similar demographic skew that we had like 80 years ago. Coincidentally, when we arguably had our best economic opportunities for citizens. Who'da thunk.
Clearly, the only solution to our fading relevance is opening the border again and importing 500 million more ""doctors and engineers"" all the while China is investing in their *actual* doctors and engineers, and has extremely strict immigration policies [2].
You're conflating Mexican border hoppers with skilled immigrants.
I'm absolutely opposed to illegal immigration and have a more extreme position on how to deal with it than most Americans.
What I'm irked by are Trump's attacks on legal immigration and the general worsening of the environment. ICE's kidnappings, the 100k H-1B fee, and the recent Green Card thing have deeply eroded America's attractiveness to legal immigrants.
I think when MAGA came after H-1Bs, it became pretty clear that it's not about law and order, it's just a race thing.
And if you want to go gloves off, I'll just say it: the main problem in America is that its 3 major ethnic groups are infected by anti-intellectualism and slothfulness, whereas the Chinese and various other cultures are not. The direct benefit from skilled immigration is so that we can increase the ratio of people who actually value education and hard work vs the failing old stock Americans whose broccoli-headed kids dream of becoming YouTube influencers instead of astronauts.
CuriouslyC 3 hours ago [-]
H1-Bs are the most egregious example, because they're 100% used as a way to undercut/replace American talent. The irony is that the typical border hopper is working jobs Americans don't want, for wages Americans wouldn't take, and they keep a low profile to avoid getting deported.
The desire to be influencers isn't as boneheaded as you think, in a future where AI is solving the hardest technical challenges, the ability to get attention and create community is the last frontier. Influencers and salesmen will be eating good when scientists and engineers are derelict.
SmirkingRevenge 3 hours ago [-]
> The U.S. is more diverse than it's ever been [1], and under Trump we're still below the deportations of Obama's terms.
Ethnic diversity is neither really here nor there in terms of the measurable needs that immigration fulfills. Immigration keeps economic and population growth rates trending up. Having high skilled immigration to bolster science and research is nice, but it's still mainly about the growth.
Yea, Obama deported lots of people, but even then we still had net positive migration. Now under Trump, we have net negative migration for the first time in decades. The very public terror campaign waged by the Trump admin was in part to deter immigration in the first place.
> Sounds like open-borders immigration was never necessary in the first place, given that we're being beat by a country with a similar demographic skew that we had like 80 years ago.
1) Economic growth is possible with stagnating/declining population levels if you overcome those deficits with commensurate increases in productivity per capita. Otherwise, you're cooked.
2) The US is actually far more productive per capita than China - in fact, the US is one of the best in the world, as far as that goes.
With those points in mind, we can begin to see why China has an easier time growing economically with little immigration. The US has a much harder time doing the same. We need more population, since it's just harder to squeeze more productivity out of our already very productive workforce.
Once China achieves similar productivity levels, they will need to rely more on growing the population.
We were actually on track to catch up to China's population levels in a few of decades (thanks to immigration). So unless China successfully pivoted to mass immigration or expansionism, the US was likely to remain dominant - easily so - for the foreseeable future.
That's why the MAGA anti-immigration push is so tragically stupid and suicidal (if it persists). They're killing America's golden goose.
As an aside: I wish the "open borders" canard would die. We've never had open-borders immigration in recent history. Definitely not since 9/11. Not even under Biden. Border laws were enforced. Biden has the same apprehension rate at the border as both Trump and Obama.
avadodin 7 hours ago [-]
That's such a gross misrepresentation of reality.
First of all, the only group of immigrants targeted by the admin are those critical of certain middle eastern regime.
Republican racists mainly care about the immigrants that do not take their middle-class jobs anyways.
Anti-Indian hate is restricted to a minority of software engineers and anti-Chinese hate is virtually non-existent.
I do believe it is idiotic to have your universities full of Chinese, your manufacturing in China and, at the same time, treat China as a geopolitical enemy.
dzonga 6 hours ago [-]
people might not wanna admit it because it feels politically incorrect - but that belief is massively due the idea of "western (white) supremacy".
cz if you're smart & pragmatic - then you will know innovation can come from anywhere - but western elites choose to continually bury their heads in the sand.
delfinom 13 hours ago [-]
Propaganda. We americans ate that shit up.
There's nothing special about anything we design in the US other than time and money commitment to create it. China did have some espionage of course going on, but the vast majority of shit isn't some secret. And with the US shitting on China with restrictions, we increasingly caused them to invest time and money into things they otherwise would have passively accepted as coming from the west. ASML sees the writing on the wall for themselves in particular.
jdcasale 11 hours ago [-]
It's both.
The US has generally resorted to propaganda rather than addressing the self-inflicted structural conditions responsible for the erosion of our dominance. China also conducted a broad, sustained, large-scale campaign of IP theft across almost every industry.
Obviously there is no natural law preventing China from innovating (We have treated political liberalism as a prerequisite to innovation in a way that was always partly self-congratulatory), but it's also obviously true that the speed of the gap closure is due in significant part to theft.
That doesn't change the fact that they are now a legitimate competitor who has gotten a lot of things right (and among these, some things that we get very wrong) and probably actually leads in some areas.
infecto 11 hours ago [-]
I like this take a lot and agree with it. The US for too long has been asleep at the wheel on many areas, power generation one of them. China with no doubt has conducted very deep and sustained espionage campaigns and even with LLMs there is enough evidence that most of the initial gains was training off of western models. Again no complaints here but I think it’s important to acknowledge both which can be true at the same time.
FpUser 8 hours ago [-]
>"Again no complaints here but I think it’s important to acknowledge both which can be true at the same time."
and this acknowledgement will pay your bills
infecto 4 hours ago [-]
Huh?
clear-octopus 10 hours ago [-]
[dead]
13 hours ago [-]
clear-octopus 10 hours ago [-]
[dead]
sandworm101 11 hours ago [-]
As john oliver said on conan many years ago: "an inflatable barbecue!".
China can certainly design an inflatable barbecue. China can certainly biuld an inlfatable barbecue. But will the chinese people ever want and buy an inflatable barbecue? ... never. That is why the US will remain the premier consumer economy.
nl 6 hours ago [-]
The US is the richest consumer market in the world.
And yet BYD is likely to outsell Ford worldwide this year (despite being banned in the US)
I feel like the chinese government see this in terms of the space race.
Not that, there's a cool new frontier to explore.
But that its a great opportunity to subsidise an industry and watch their slower fatter competitor go bankrupt trying to keep up.
>But the US did it first
What is sputnik.
bcrosby95 13 hours ago [-]
Put another way: if the average US citizen doesn't subsidize the costs of these trillion dollar companies, China is gonna come get you. Funny that you talk about being afraid of your own shadow.
I have some exposure to utility regulation and from what I can tell some of the AI companies are "good actors" and willing to shoulder some of the burden. But others are pretty adversarial and want a free lunch.
bryanlarsen 13 hours ago [-]
Power is foundational to pretty much everything. Cheap power is going to give China a massive advantage in everything; AI is just incidental.
seviu 12 hours ago [-]
Cheap power at what cost for our planet?
Not long ago we were crying death to bitcoin, it’s going to destroy the planet.
Come AI, with unlimited power demand. Everybody screaming we need more power.
We need infrastructure, clean energy, even nuclear. We are doing all in the wrong order.
FabCH 11 hours ago [-]
China added 315GW of solar in 2025.
For context, EU added 65 and US
43.
In one year, China _added_ almost the total capacity EU has.
China is the one place where AI actually can use clean energy…
DennisP 8 hours ago [-]
China also has 1,271 GW of coal capacity, and is planning 500GW more.
bryanlarsen 8 hours ago [-]
And their coal capacity factor (ie the percentage of time they use their coal) is dropping at about the same rate.
tedd4u 9 hours ago [-]
Possibly France, too.
- 70% nuclear
- 26% renewables
- 4% gas/coal
stubish 44 minutes ago [-]
That power is already being used, and excess exported to neighboring countries.
usrnm 8 hours ago [-]
France cannot really add more of it. Not fast and cheap enough, anyway
lejalv 9 hours ago [-]
...and China manufactured almost the totality of the EU and US solar capacity.
bix6 11 hours ago [-]
China is the leader in solar?
seviu 8 hours ago [-]
For a while already
margalabargala 13 hours ago [-]
> Put another way: if the average US citizen doesn't subsidize the costs of these trillion dollar companies, China is gonna come get you.
The future is blatantly going to be electric. Between cars, heat pumps, ranges, etc, the quantity of kilowatt hours consumed will rise dramatically per capita because they are replacing burned fossil fuels.
We don't need to subsidize the trillion dollar companies, we can settle for just not cancelling wind and solar projects, and generally updating the grid infrastructure.
A rising tide lifts all boats. If the subsidies go to common infrastructure, that's good for everyone. There's no need to complain about a road being paved because it will benefit FedEx in addition to everyone else.
skeledrew 12 hours ago [-]
> not cancelling wind and solar projects
Tell it to the guy doing just that, as much as possible.
swasheck 9 hours ago [-]
windmills cause cancer and kill bald eagles so we can’t do wind. /s
jm_l 13 hours ago [-]
All public infrastructure benefits the public but the role of our governance is to correctly prioritize. $100 billion spent on nuclear power plants is $100 billion being withheld from other critical social services.
manyatoms 12 hours ago [-]
The US could very causally spend a couple $100B less on their military and not have a real reduction in capability.
margalabargala 13 hours ago [-]
> $100 billion spent on nuclear power plants is $100 billion being withheld from other critical social services.
What? No it isn't.
There are many places the government could use to appropriate funds, not just social services. The military, for example. Other subsidies. Tax credits. Simply increasing the debt.
coliveira 12 hours ago [-]
No, the money is not coming from a fixed box. When the US wants to do something (typically starting a new war), they never ask where the money is coming from. This tells you everything about how the decisions are made, if it is a priority for them, they will spend the money first and ask questions later. If green infrastructure was a real priority they would invest the money and later find ways to pay for it.
lobocinza 6 hours ago [-]
But those wars are typically fought to maintain the US status, to preserve its ability to debase the national currency effectively siphoning wealth from the world economy. Self-preservations comes first. I'm just describing the system.
airstrike 6 hours ago [-]
Which of the wars in Iraq, Afghanistan, and now Iran were fought to maintain US status?
margalabargala 2 hours ago [-]
All of them. Afghanistan as a response to 9/11, Iraq as a regional power flex and entrenchment. The first certainly preserved US status more than letting 9/11 go unanswered, the second to remove a hostile regional power and replace it with a friendlier government.
Now in Iran, the intention was to repeat Venezuela and effect regime change in a hostile country, bolstering America's military status.
Whether these wars have the effect they intend is beside the point; you're asking why they were fought, not whether they resulted in "Mission Accomplished".
gmerc 8 hours ago [-]
Remember kids, in the west it’s “investment”, in China “subsidy”
Aboutplants 14 hours ago [-]
I believe you are right. These models are at worst a 6 month lag to the costly frontier models, but the ability to scale energy production is years ahead of where the US is. That advantage is often under appreciated
Their cost of energy is what matters vs the US as much as speed buildout.
mxschumacher 10 hours ago [-]
I'm still not entirely clear on the problem <-> capability matching. E.g. it seems like Kimi K2.6 with good context would already be able to solve a huge chunk of problems. What share of prompts require frontier models?
itemize123 3 hours ago [-]
they are doing it through state investment vehicles - so it's in the same way US companies can (but won't)
energy123 13 hours ago [-]
It's not really a bottleneck. US capital is building data centers in South Asia, MENA and SEA. Many of these countries offer tax breaks because they want US data centers, and they have abundant equatorial land for solar.
You might say that US would prefer sovereignty but that's a separate argument vis-a-vis strategic competition with China in particular.
delfinom 13 hours ago [-]
Wonder if they are finally exploring installing anti air defenses on these datacenters given they are massively expensive and devastating targets of extreme opportunities.
dartharva 9 hours ago [-]
> while Western Democracies are afraid of the future, and of their own shadow.
Trillions of Dollars being invested against AI infra would indicate otherwise. US is in fact betting a lot of its economic future on AI.
windexh8er 1 hours ago [-]
It's not western democracies. It's western capitalism, and more poignantly, western billionaires. They're feeding the narratives. Peter Thiel, Sam Altman, Elon Musk, Mark Zuckerberg - they're the ones with bunkers and exit strategies. They are the lunatics buying seats at the political table and spreading FUD and meddling in our elections. They are the ones destroying the west's chances at a competitive future, instead: "capitalism".
They wanted the division, they're getting it and one side is raping and pillaging the masses.
epolanski 7 hours ago [-]
> China is building for the future, while Western Democracies are afraid of the future, and of their own shadow.
Yes, countries where compromise is not required, where social, capital and human costs are non-factors and where regulations are bendable at will by who's in power can be more effective at achieving some goals.
dominotw 7 hours ago [-]
> China is building for the future, while Western Democracies are afraid of the future
who are the decision makers in china?
tmnvix 15 minutes ago [-]
The 96 million members of the CCP and the entrepreneurs incentivised to make decisions based on the policies they introduce?
Who are the decision makers in western democracies?
I'm being slightly facetious - there are many answers to these questions.
The one that actually matters to me though is "do the people that are making the decisions do so in the interests of society?" Not in my 'democracy', that's for sure.
themafia 11 hours ago [-]
> then it will no longer require subsidy from them
Is there actually a huge Chinese consumer market for these products? If not then I'm not sure how you ever actually achieve this endpoint. Chinese wages and American wages are not nearly the same thing yet.
> It will simply be absolutely cheaper (including profit margin) to serve tokens in China.
It will simply create more pollution and environmental destruction too.
> China is building for the future
That's the plan. Whether that's true requires an honest analysis.
> while Western Democracies are afraid of the future
Developed nations take fewer risks than undeveloped ones. Do you assume this pitched dichotomy will naturally sustain itself?
> and of their own shadow.
Yea, it's funny what having open and fair elections can do for a country.
lejalv 9 hours ago [-]
You got me with fair. Gerrymandering, PACs, two-party system, electoral college.
Where do we start...
themafia 9 hours ago [-]
We start logically. Do you presume your handful of cases exemplify the entire Democratic system? Do you assume that "China" is best understood as a single centralized entity?
You completely walked past the argument to pick at a meaningless nit.
lejalv 9 hours ago [-]
Handing out lessons in democracy from the record-holder country in foreign intervention (https://en.wikipedia.org/wiki/United_States_involvement_in_r...) had equal civil rights only in the 1960s, pardoned the perpetrators of Jan 6, has its supreme court in entirely political hands, and has the awesomest repressive force in the world, together with the incarcerated population to go with.
Maybe I picked like 4 meaningless nits as in: US politicians respect so much democracy that they constantly reweight "one person, one vote" to suit the interest of the incumbent, they do not have their outrageously expensive campaigns financed (legally) by private interest groups, the popular vote is represented, and elections are uncontested (unless the wrong candidate wins, where the Supreme Court promptly fixes the issue), and it has room for more than two (quite similar I may say) viewpoints in representation.
Maybe.
But please don't call “Yea, it's funny what having open and fair elections can do for a country.” an argument.
themafia 7 hours ago [-]
Please don't take one sentence out of a larger context and pretend it represents the argument.
Which, again, you've managed to completely ignore.
The argument, ironically in black and white, so you can sense it, "this isn't a black and white scenario and seeing it as China vs USA blinds you to the complex differences and global geopolitical forces involved."
I get that you don't personally like America, for whatever reason, but you've blinded yourself to sense in your rush to convey your rather negative and absolutely common sensibilities.
In the last 40 years, China has been building while the US has been wasting money and lives fighting wars. Can we learn to really put America first for once?
delfinom 13 hours ago [-]
Yea, I really don't see how much longer the US economy can hold on. The baby boomers are working overtime to rob multiple future generations of opportunity to feed their profits now.
The formerly "fiscal conservatives" that I know are working overtime explaining how the debt isn't a bad thing and we can just move numbers.
xienze 10 hours ago [-]
> The formerly "fiscal conservatives" that I know are working overtime explaining how the debt isn't a bad thing and we can just move numbers.
Sounds like they're just catching up to what Democrats always used to say whenever a Democrat was in the White House and some Republican would complain about the national debt. "A government isn't a household, debt doesn't work the same way, you don't get it."
59nadir 9 hours ago [-]
That's interesting, because I thought it was common knowledge that Republican presidents actually add more on average to national debt...?
zrtac 14 hours ago [-]
That is the talking point of OpenAI and a16z's super PAC:
"Build American AI, a nonprofit linked to a super PAC bankrolled by executives at OpenAI and Andreessen Horowitz, is funding a campaign to spread pro-AI messaging and stoke fears about China."
In reality Xi has warned of AI bubbles. If China was really pushing it they'd be equal or ahead because so many researchers are Chinese anyway. Instead, China is building real stuff instead of focusing on hot air like a16z ("crypto", "AI", you name it). Maybe China should sponsor that PAC to accelerate the demise of the West.
aurareturn 13 hours ago [-]
They wouldn’t be ahead because they can’t buy Nvidia compute racks anymore and they don’t have EUV machines.
Blackwell is 10-20x more efficient than H200. Vera Rubin is expected to be several times more efficient than Blackwell.
I do wonder how most Chinese employees at OpenAI and Anthropic feel about their employer constantly spreading anti China propaganda to decrease competition. Perhaps money solves almost all things so they go along with it.
coliveira 11 hours ago [-]
This is the next phase of the OpenAI deception: give us as much money as we want or you'll be labeled anti-US and pro-China (guaranteed by the propaganda arm of openAI).
lenerdenator 13 hours ago [-]
> while Western Democracies are afraid of the future, and of their own shadow.
Well, yeah. This is a technology that has the potential to make large chunks of the population unemployed.
Chunks of the population that took on debts prior to late 2022 with the understanding that there would be a way to pay those debts back with their labor.
blowscum 13 hours ago [-]
> Chunks of the population that took on debts prior to late 2022 with the understanding that there would be a way to pay those debts back with their labor.
I’m calling it now, the future is indentured servitude.
watwut 10 hours ago [-]
American companies are selling tokens on a loss for years now. Where is that alternative universe in which America is not subsidizing this?
Selling under price to capture market was American playbook for last 20 or more years.
clear-octopus 10 hours ago [-]
[dead]
huflungdung 13 hours ago [-]
[dead]
ufish235 14 hours ago [-]
What the fuck are you talking about - have you seen what data centres are doing in the West? Do you want more of that?
infecto 14 hours ago [-]
I have not fully seen or appreciated most of the negativity. Obviously there are exceptions to that but in my eyes it has largely exposed how vulnerable the west is due to poor infrastructure constructs and a lack of building out generation and transmission.
arjie 13 hours ago [-]
To be honest, I’m sort of annoyed that the datacenter around the corner from my home closed. It was a five minute walk on 3rd street and I know of it because we used to have so many cages there 15 years ago. Now I have to drive to Fremont.
Nifty3929 14 hours ago [-]
Yes, and yes!
stavros 11 hours ago [-]
What are data centers doing? I'd never heard of anybody having had a problem with them until about two months ago.
bryanlarsen 14 hours ago [-]
Yes, I want cheap clean power.
stuaxo 14 hours ago [-]
Nope.
We have exported production to China in many things, we forget that we had dark satanic mills of our own.
habosa 7 minutes ago [-]
This is shockingly cheap, and by all account it's a very smart model. Is there a US-based provider for DeepSeek V4 Pro that offers a similar cost? I want to use this at work but can't justify sending company data to Chinese servers.
Amekedl 17 minutes ago [-]
DeepSeek rules. I'm using it to do stuff that's not too big in scope, because I still need to remain in charge. Even for this, western competitors have no chance, least Anthropic and OpenAI, plus Gemini also has gotten too expensive besides flash (which is arguably just great, too).
With this, I am sticking to deepseek-v4-pro entirely.
revolvingthrow 14 hours ago [-]
Amusing that just when the big three AI providers from US raise prices significantly, even for the mini models, you’ve got a Chinese model slashing their already-cheap offer by 75%. Not to mention you can run this model on your own hardware, although admittedly even the flash stretches the meaning of local for individual people.
skybrian 13 hours ago [-]
My guess is that the popular US providers get a lot more traffic and are supply-limited. No point in lowering prices unless you can serve the traffic that will result.
Aurornis 10 hours ago [-]
Nothing weird about it. It’s all supply and demand.
The US providers are at capacity limits and are increasing pricing as demand increases.
The Chinese providers are relatively unknown and not even allowed for a lot of applications. They have to cut the price just to be attractive.
arbuge 7 hours ago [-]
Can they actually make money at these prices?
elcritch 6 hours ago [-]
Yesterday I did some testing on the cost to solve the same simple problem on openrouter with different models using cline. Simple problem but it had a few nuances to solve it properly and so required reasoning.
After reading comments like this I was expecting (hoping?) that DeepSeek or similar would be cheaper.
However I was surprised that DeepSeek v4 cost about 5.5x GPT-5.4 to solve the problem.
That doesn't sound right. Were you using the actual Deepseek provider? The one time I spent 3 dollars on Deepseek in a day, I had 615k output tokens, 96M cache hit input tokens, and 5M cache miss output tokens.
nl 3 hours ago [-]
It's not unheard of for "more expensive" models (on a per-token basis) to end up cheaper than weaker models (on a per-task basis).
Kimi K2.5 is roughly double the price (per token) of DeepSeek v4 Pro, but cost $0.05 vs $0.16 (for the same score) on my own benchmark.
Yeah, I struggle to use more than a few dollars a day using Deepseek V4 Pro (max reasoning).
* Some people suggest not using max reasoning due to overthinking and looping issues, this may consume more tokens than needed.
Lwerewolf 13 hours ago [-]
Given that you can run quantized flash on 128g ram, and there's a heavy focus around it (DS4)... I'd say that it's pretty feasible for a decent amount of devs. Never thought I'd buy an MBP but here we are.
n.b. I can't use nonlocal models for a big chunk of my work, so there's that as well.
yogthos 3 hours ago [-]
I imagine electricity costs being a third of what they are in the US in China has a lot to do with it.
gmerc 8 hours ago [-]
IPO metrics juicing is a bitch
MattDamonSpace 11 hours ago [-]
Capitalist competition at its finest
bwfan123 14 hours ago [-]
Kudos to the DeepSeek folks for making tokens not only affordable but also open source. This is a race to the bottom for token costs in a good way.
tomaskafka 7 hours ago [-]
Open weights aren’t open source. Source is the learning data and algorithms, and that is closed.
azinman2 6 hours ago [-]
And this is purely a way to undercut American models. If/once they’re ahead, it’ll stop being the case. Already qwen is doing that.
HDBaseT 5 hours ago [-]
I'm not entirely sold on this idea, open source models aren't really hurting Deepseek or Qwens bottom line.
99.99% of people cannot run these models on their own hardware, they are forced to rent it from someone. That someone is almost always the big China players themselves anyways.
azinman2 3 hours ago [-]
First, there’s manyyyy model inference providers out there world wide. Just look at open router. Second, it’s well known in SV that most startups are using Chinese models because they have access to the weights… and that makes it far cheaper.
Why else is Qwen now having cloud-only models?
HDBaseT 3 hours ago [-]
There is plenty of other inference providers, but tell me, who is the cheapest?
Deepinfra is almost 3x more expensive and they are using a fp4 model, with Max 16.4K output (vs 364K) and have significantly lower throughput!
shimman 3 hours ago [-]
Calling them American owned models implies some sort of public ownership. These are models controlled by individuals whose benefits are absolutely not uniformly shared among the populace.
I mean FFS a single hyper scale datacenter can provide free school lunches for a year. Something tells me the economic output of making sure children are fed is way higher than whether Zuckerberg can own another Hawaiian island by allowing people to be scammed by LLMs.
azinman2 3 hours ago [-]
Not really, it’s a pretty common way to address companies that are part of a bigger geopolitical story. The press will happily refer to Chinese models, European when talking about Mistral, Canadian with Cohere… etc.
I’m an American person yet I’m not public property.
shimman 2 hours ago [-]
The implication is that American models winning would actually benefit Americans. That's not going to happen at all and talking about as if China "winning" would harm Americans is delusional cold war thinking at best.
alyxya 2 days ago [-]
Once they have their own coding agent which they seem to be working towards, I may start predominantly using their models. They seem to be doing all the "right" things, open sourcing models, publishing research, and keeping prices low for everyone.
I'm working on a custom launcher for hooking up Claude Code with various providers (groups env variables in profiles) cause DeepSeek doesn't have vision and sometimes I need browser use with screenshots or Opus reasoning, for other tasks it's fine: https://ccode.kronis.dev/
# After installed (or when run portably with ./ccode)
ccode init-config
ccode edit-config
# Run with default profile
ccode
# Run with named profile
ccode --deepseek
# Set default profile
ccode set-default-profile deepseek
Also turns out that with a local proxy you can get Remote Control working and see the DeepSeek sessions in the desktop app, screenshots on the page. Other than that, I'm happy that it works pretty well and the discount is enough to make me consider going from Anthropic's Max subscription to Pro and using it only where DeepSeek is insufficient. With that proxy I eventually hope to be able to transparently switch models mid-task, if I need Opus for like 5 turns or something.
Overall though I'm not sure exactly how well Claude Code would stack up against OpenCode, since the latter overall feels a bit less hacky with 3rd party models and is even getting niche but nice features like a locally runnable web version: https://opencode.ai/docs/web/
BiraIgnacio 2 days ago [-]
I've been using V4 flash consistently with Claude. Pretty great fast and darn cheap. I use it about 3h/day and so far haven't crossed $1 USD/week.
V4 Pro is between Sonnet and Opus. But it is cheap. Slow but very cheap. Very diligent.
I run a proxy that allows me switching back to Opus when necessary.
Deepseek isn't like Z.ai which is bit cheaper only on the surface. Or like Qwen 3.7 Max which is Opus-level but very expensive.
Deepseek is my favorite since V3 but V4 is definitely catch-up to newer Anthropic models
itsthecourier 2 days ago [-]
thank you so much for sharing ir
rjh29 2 days ago [-]
How does the cost compare using the API vs the $20/month plans with other providers?
I did some back of the envelope calculations and it seems like you would pay $5/month using DeepSeek directly or $15-20 with OpenRouter or similar. But would be interested to hear real world usage.
But as usual, there are far cheaper subscriptions with higher limits than Anthropic and OpenAI, that also provide DeepSeek v4 Pro. So you should use those subscriptions first until you max them out, then look at a different subscription.
iammrpayments 2 days ago [-]
I don’t even use Claude that much and was hitting limits in the 20$ using sonnet, I’ve deposited 5$ with deepseek and haven’t hit the limit after spending 60million+ tokens. So no way it’s more expensive.
nchmy 23 hours ago [-]
The link you shared is just a large table of data, which is hard to browse on a phone.
Could you please elaborate on the far cheaper subscriptions that we should be using?
stavros 2 days ago [-]
I've been using it pretty extensively over a month and I'm at maybe $7. It thinks for quite a while, but the results have been better than Sonnet for me.
maxdo 2 days ago [-]
I'm not curious what tasks you tested it for. Im working on coding agent writing code dynamically on request for customers. i'd say code itself very simple and aggressively cached, and patternalized, e.g. we adding lots of hints to the system.
the only real family models that work were claude and openai, surprisingly, for tasks that needs faster speed, gpt 5.4 is very impressive. Deep seek was very average , doing things somewhere in gemini flash 3.0 domain.
thisisit 2 days ago [-]
I am curious - Is there a way to switch between models depending on the task? Because I believe Deepseek V4 is not multimodal and it will be good to switch back to Claude if vision or other capabilities are required.
mewse-hn 2 days ago [-]
I was looking into something similar because I wanted to test a local model for doing basic coding and smart model (deepseek) for planning.
It's basically not possible with claude code, the api endpoint is a single environment variable and whatever models are on that endpoint are what's available.
HOWEVER, if you run a proxy like LiteLLM, you can configure it to send requests to different api endpoints on the back end and expose them as different "models" on the front end, then configure claude code to switch between those virtual models.
Right that says it has a proxy feature so it can probably do what I was describing with LiteLLM
mvanbaak 2 days ago [-]
Check out the project called superpowers. It can use different models for different agents. I use it witb opencode to have different models for reaearch, planning, execution, testing etc
longsword 2 days ago [-]
There is a tool called deepclaude, which runs a proxy in the background capable of doing this, by simply doing /model in Claude.
maxdo 2 days ago [-]
i've been trying that, in reality every time you try to save it, it's not worth it, the cost of mistake is so high , you can spent 2-3h on just wrong assumption, you lost your time and all the burned tokens.
firecall 2 days ago [-]
It seems you can use the Claude Code CLI harness without a Claude Pro subscription now, which I don't think you could a before?
I've been using Deepseek v4 with Cline in VS Code as a replacement for Github Copilot, and it's not been too bad.
hbarka 2 days ago [-]
The npm install of Claude Code deprecated, since Feb 2026.
Scarbutt 2 days ago [-]
Surprised Anthropic hasn't done anything to restrict Claude Code from using other providers.
cortesoft 2 days ago [-]
At this point in the AI wars, it is probably better to have more users of Claude code rather than restrict which LLMs it can connect to. Claude code is probably (currently at least) stickier than the LLM model itself. Getting people into the Claude code ecosystem is worth it.
Later, they can always lock it down more or add Claude LLM only features to it.
wolttam 2 days ago [-]
The value of Claude Code the harness isn't that great. There's a lot of other good harnesses out there.
rane 2 days ago [-]
I thought so, and then I tried Opencode and Codex and started to appreciate Claude Code a lot more. They've actually done great work with the small details.
intuxikated 2 days ago [-]
I actually have't looked back since trying opencode
The ability to properly see what the agent is doing in tool calls and subagents is really unmatched, CC strips all reasoning and return values, only displaying tool calls, and you're unable to expand a single subagent, it's expand everything and scroll endlessly or show everything collapsed with basically no info at all (read x files, ran x commands)
Just seems like extremely basic features are missing
You can check my profile for which one I like most :) I do think there have been efforts to benchmark different harnesses.
Personally I'm not going to choose one harness or another based on +/- a few percentage points in a benchmark. I'm going to use one the one that I find the most ergonomic, that isn't too bloated, etc. The models are the primary lever, not the harness.
koolba 2 days ago [-]
Good or better? Curious which would be in either bucket.
wolttam 2 days ago [-]
Probably a matter of taste. I prefer the harness I wrote, I don't want to go near Anthropic's bloated mess of a harness with a 10-meter pole.
odiroot 2 days ago [-]
IMHO the ergonomics of their tooling are not great. I'd rather use Codex or even OpenCode.
Configuration alone is very arcane with lacking documentation. Sandboxing/permission system is quite confusing too.
HWR_14 2 days ago [-]
It went the other way, you can't use other harnesses to connect to the cheaper versions of Claude. So clearly they think their current moat is Claude Code use, not the LLM itself.
jdasdf 8 hours ago [-]
I'm my experience claude code is kind of shit.
Pi works very well with deepseek though
wiradikusuma 2 days ago [-]
That's interesting. I thought Claude Code is not as good, therefore people want to use Claude model with other alternatives. This is the other way around.
Which begs the question, regardless of the model, which Claude Code alternative is better? (I keep saying "Claude Code alternative" because I don't know the term... LLM CLI?)
flexagoon 2 days ago [-]
AFAIK the two most popular open source harnesses right now are OpenCode and Pi. They take a pretty different approach, OpenCode includes a lot of features while Pi is very minimal by design and focused on extensibility, to the point where many people are just asking Pi to write a plugin for itself whenever they want it to have a new feature. I personally like Pi's philosophy more and I think its developer justified the choices really well in his blog post:
Author blocks referrals from HN, weirdly dramatic, especially considering they have 1086 karma here. I wonder what we did to them.
flexagoon 2 days ago [-]
Oh damn, I haven't noticed because my browser removes the referer header. But I think the image on the block page is a pretty good answer to why he did that.
SturgeonsLaw 2 days ago [-]
What's the image trying to convey? Genuine question, I just come here to read nerd stuff and I'm not aware of any controversy
flexagoon 2 days ago [-]
The image shows Garry Tan, the CEO of Y Combinator. He has lately been on a huge AI psychosis streak, bragging about things like "shipping 37000 lines of code every day" and "using Claude Code so much it burned out his USB-C power connectors". He's in a lobster suit because he's talking about OpenClaw, an AI agent assistant which those same AI psychosis types lean into too much by giving it full read-write access to all their life and then getting surprised when it accidentally deletes all of their emails.
Pi's developer is obviously not anti-AI, and he definitely doesn't hate OpenClaw, since it's based on Pi. But there's a growing number of people who take those things too far, and a lot of them are on HN. You can easily find them in the comments of any AI-related post here. I assume that's the type of people the image is portraying.
2 days ago [-]
wrs 2 days ago [-]
The common term for a tool that wraps an LLM with a workflow is “harness”.
jijji 2 days ago [-]
I've seen good results with opencode connected to glm 5.1 on ollama cloud... for $20 a month you get similar performance that you get with opus 4.7
copperx 2 days ago [-]
I love oh-my-pi, but I'm not sure if it's "better". Maybe just as good.
g023 2 days ago [-]
I use DeepSeek v4 flash with CoPilot and it works pretty good.
2 days ago [-]
LaurensBER 2 days ago [-]
It works very well with OpenCode. My team keeps hitting the 5h limits on other subscriptions and it's pretty good to have Deepseek as a backup. I just put 50 bucks on there and it feels like it'll never run out.
It's not good enough to fully replace any of the frontier models yet but it's definitely great to have as a backup!
lambda 2 days ago [-]
Why do you need them to provide a coding agent? Just use their model with any off the shelf coding agent. I happen to prefer Pi, but use whatever works for you.
alyxya 2 days ago [-]
I probably have an unfounded assumption that whatever coding agent they make will work really well with their models, better than external harnesses. I don't have a good sense for how all the model + harness combinations compare, nor any good way to compare them myself, but generally believe model companies train their models to work best with their own harness.
wolttam 2 days ago [-]
I've noticed that models have gotten less finicky with this over time. Harnesses don't need to be complex to get good coding performance from models, they just need to implement some sane primitives for code exploration and editing.
wyre 2 days ago [-]
It is in the model's provider's interest for you to believe this because they get to lock you into their harness and inference. As models get better they will get better at using any harness, it comes down to how well the harness is actually engineered. I highly recommend you take an hour or two and check out Pi to either solidify or change your assumption. The harness is essentially just another developer tool and can be as opinionated, overly-engineered, minimal as anything else. I would think for DeepSeek, especially, they're efforts are much better spent researching how to make their LLM's better instead of working on engineering a harness that might get some marginal gain building it for their models.
Yeah, I'm using Pi with their models through an OpenCode Go subscription and it works pretty well. 10 bucks and V4-Flash is virtually infinite.
apitman 2 days ago [-]
What's the best way to use it with Pi, OpenRouter?
schaefer 2 days ago [-]
> What's the best way to use it with Pi, OpenRouter?
I can't claim it's "the best"...
But the Pi.dev and OpenRouter combo is what I'm doing at home, and I love it.
Setup was easy, I can use /model to switch between any of the openrouter models and whatever I'm hosting locally via VLLM.
brianwawok 2 days ago [-]
Open router is a 5% tax? If you use it seriously may as well skip it
schaefer 1 days ago [-]
I don't have an LLM-positive culture at work.
I'm on a bit of an island. Or under a rock.
Anyhow, I'm pulling myself up by my own bootstraps.
For me a 5% overhead is fine... if it gives me better visibility of this rapidly moving field.
lambda 2 days ago [-]
I only use local models myself personally. But yeah, OpenRouter would probably be a good option.
lofaszvanitt 2 days ago [-]
Qwen cli
satvikpendem 2 days ago [-]
RL with the harness inputs and outputs of users is one of the primary improvers of model performance, a self perpetuating flywheel.
smoe 2 days ago [-]
Earlier this week I started testing Chinese models on my codebase. I haven’t really looked at interactive coding yet, but more at issue triage, bug auto-fixing, log analytics, etc.
I used DeepSeek, Kimi, GLM, Qwen, and MiMO against GPT-5.5 high as reference, all running in Pi harness without anything installed.
So far, Kimi and MiMO look the most promising to me. I haven’t tested them rigorously enough to make a strong statement, but my first impression is that, in practice, all those models may be less behind on typical daily tasks than people think.
They are a bit “work hard, not smart". Getting to same-ish results more slowly and using more tokens, but at a fraction of the price
try-working 2 days ago [-]
I just did a little comparison using benchmarks for GPT 5.1 through 5.4 to map out the equivalent capability-level of some of the Chinese models.
Based on these benchmarks, here's a rough mapping:
- Qwen 3.7 ~= GPT 5.3
- Kimi K2.6 ~= GPT 5.15
- DS V4 ~= GPT 5.1
So yes, we have GPT 5 at home now. No need to pay the Legacy Labs anymore.
I switched to predomentantly using mimo this week, mostly out of curiosity to see how dependant I was on frontier models. Honestly I cant really tell the difference. I would say I work on pretty average codebases with well know frameworks doing pretty typical things and initial impressions is that mimo, kimi and deepseek can probably handle what I need more or less the same as gpt5.5 or claude.
2 days ago [-]
c0rruptbytes 2 days ago [-]
I personally really like DS4 Flash - it's the largest I can run locally with decent speeds and I feel like it's good enough to maintain a codebase with less effort
r0b05 2 days ago [-]
What hardware and quant do you run it with?
maxdo 2 days ago [-]
maybe i need to give it second chance, surprisingly Kimi 2.6 consistently fail even to generate valid json plan, where gemma 4 was doing really good, but slow.
JSR_FDED 2 days ago [-]
Are you going through OpenRouter or direct? I’ve had nothing short of excellent results from Kimi.
15 hours ago [-]
jdboyd 2 days ago [-]
I would prefer a coding agent to be somewhat independent of the model provider. Providers are trading off on quality, features, and price so frequently, and I don't want to keep changing my agent every time.
I am looking forward to things slowing down and stabilizing. I'm not saying that should happen today, just I am looking forward to it.
gaolei8888 2 days ago [-]
I think this will happen much sooner than we thought. Maybe it will happen in next 6 months
akritid 2 days ago [-]
You can take Codex today and ask it to rewrite itself to work with any API
hawtads 2 days ago [-]
There is OpenCode and Pi, they both work pretty well
azinman2 6 hours ago [-]
And not letting you opt out of being their training data.
tequila_shot 2 days ago [-]
You no longer need "their coding agent". You can hook up claude code to use Deepseek. Works perfectly.
minimaxir 2 days ago [-]
Zed's Agent natively supports a DeepSeek API key now. (do not use it through OpenRouter if you want to save the most cost)
potsandpans 2 days ago [-]
Give pi a try if you haven't already. Avoid vendor harness lock-in.
vinhnx 2 days ago [-]
You can use DeepSeek with my coding agent VT Code. Recently I've added DeepSeek V4 Pro and DeepSeek V4 Flash support with all providers, via: Official DeepSeek API, HuggingFace, Ollama Cloud, OpenRouter providers.
antirez's ds4-agent works quite fine. It runs on any Apple Silicon device with 96GB RAM or more.
rjh29 2 days ago [-]
I wonder how many years it'll take for the API token cost to exceed the money spent on ram.
zozbot234 2 days ago [-]
The DS4 folks are unofficially testing ways to run the model with lower performance on lower-RAM machines. Similar efforts are going on with llama.cpp. The results are a bit of a challenge, prefill time tends to explode which is a limitation if you care about agentic workflows.
vrganj 2 days ago [-]
Anything that runs with 64?
zozbot234 2 days ago [-]
You can just try it yourself, it will probably run with a heavy slowdown using SSD offload.
raincole 2 days ago [-]
All the major coding agents already support DeepSeek.
cultofmetatron 2 days ago [-]
open code works with them today. I've been using it fulltime for 2 weeks so far.
sunaookami 2 days ago [-]
Using it with Pi and can only report good thing so far. I'm very impressed by how good it is (also it's way slower than Claude Sonnet and GPT-5.5 and often thinks "too much" before starting).
teekert 2 days ago [-]
Why not OpenCode? Genuine question, not an expert..
2 days ago [-]
ReptileMan 2 days ago [-]
Both pi, opencode and zed work amazing with deepseek.
Guillaume86 2 days ago [-]
You seem to have tried a few things, if you don't mind I have a few questions as someone currently on Claude Code but would prefer to not lock myself in a commercial ecosystem (and their pricing change regarding headless usage is annoying me):
- how do/would you add the WebSearch tool to your harness? pay for a separate service or does deepseek offer something with their subscriptions?
- do pi/opencode support pasting images in prompts?
- how do you handle reading images? deepseek is not multi modal IIRC? do you pay for another model and route to it?
Any of these missing would really annoy me in day to day use...
wyre 2 days ago [-]
Brave, Exa, and Tavily all offer a free tier for websearch, after that it comes out to like 1¢/search, very easy to ask pi to build a web search tool using any of these providers.
They support image locations like a file or url, but not regular images (opencode desktop might though?)
Both pi and opencode make it very easy to change models so you can easily call to 5.4-mini or whichever multi-modal LLM for reading images. I'm sure you could even create a skill to automate the process too, having the model use the cli to send the photo to the multi-modal and give it back a description.
ReptileMan 2 days ago [-]
I use them for pure coding, but I think they do curls when needing something from the host machine.
Guillaume86 2 days ago [-]
Yes I'm also using it for coding: I often make the agent use WebSearch in the research phase when deciding on a stack or a library or research best/modern practices to do achieve something. As for images I find it super useful to be able to paste snipped screenshots to show the agent when something is wrong in a UI/frontend or just something I can't copy paste easily.
linzhangrun 2 days ago [-]
there already is a open-sourced deepseek-tui coding agent.
besides, you can always connect to opencode.
jack_pp 2 days ago [-]
i have done some amazing things for 5 dollars, using opencode. give it a shot, it is incredibly cheap
vinhnx 2 hours ago [-]
DeepSeek's KV cache is impressive and very cost-efficient for long-horizon tasks. I tested on VT Code with DeepSeek V4 Pro, and the cache-hit ratio is high. *I build a coding agent and have recently improved and hardened DeepSeek V4 integration. I registered the DeepSeek API key, topped up just 2 USD, and used just DeepSeek V4 Pro and Flash with max thinking. So far the most visible improvement in both cost and context is the cache improvement; it's quite impressive with this announcement of a permanent price drop. (https://xcancel.com/vinhnx/status/2058748305350557932). Currently, my usage is still at $1.15 after 2 full weekends.
wg0 2 days ago [-]
If you have not tried DeepdeekV4 you're missing out. The pricing makes it unbelievably good.
The chains of thought for Deepseek are very very interesting reads. Open code won't show them but do read them and you'll be surprised at how underrated the model is.
My model usage is very low but I still do pay directly to Deepseek regularly as my tribute and contribution to them open sourcing their models as my gratitude and showing support for what I deem positive for overall social good.
abyssin 2 days ago [-]
It’s good and cheap, but don’t talk about politics to it or it might trigger some sort of censorship rule. You can see it think, then suddenly erase everything and suggest to switch to another subject, without explaining anything. I also had it output some sort of generic message about how the news outlets are in the service of the people. Both times I was surprised because I didn’t make any sensitive requests, neither illegal nor subversive. But it was a remotely political topic and it was enough. There was something both chilling and refreshing about it, since censorship in the west is usually more subtle.
pton_xd 8 hours ago [-]
How is the censorship when asking for "dangerous" actions like writing a port scanner?
ux266478 2 days ago [-]
The base model doesn't have these problems FWIW
cosmojg 2 days ago [-]
How are you running the base model?
ux266478 2 days ago [-]
vLLM in a docker container, FP16 quantized on an 8x MI300X cluster. Very lazy hackjob, I didn't even set up an interface. Was constructing curl commands from string templates. I worked out if I paid that compute cost over a whole month, it was twice as expensive as the monthlies you'd pay for owning a very nice 2000sqft non-coop apartment in Midtown Manhattan. I was paying rock bottom prices, too.
joewhale 2 days ago [-]
do most people use llms to chat about politics?
true_religion 12 hours ago [-]
I once was doing unrelated work to politics, but Qwen3 said that my input had homonym characters to those in the name of Xi Jimping. It gave the strangest response twice before finally admitting the source of the error and allowing me to convince it that no, I was not trying to intentionally misspell the man’s name.
customguy 2 days ago [-]
"do most people use llms to chat about [specific thing]?"
No, of course not, why do you ask?
tequila_shot 2 days ago [-]
Yes - the model is REALLY good. I try Claude at work and Deepseek personally and this is the only model that works without trying to actively bankcrypt me.
seemaze 2 days ago [-]
Perhaps unintentional, but I find 'bankrypt' to be a thoroughly interesting portmonteau.
I'm not sure if it's when you run out of crypto, or when your bank gets hit by ransomeware.
jeffadelic 2 days ago [-]
Ironically you spelled portmanteau incorrectly. OP very well could have made a similar error for bankrupt. Maybe not, interesting to think about.
aqfamnzc 2 days ago [-]
I'll be honest, I carefully scanned your comment for a similar mispelling.
seemaze 2 days ago [-]
its misspelling all the way dwon
2 days ago [-]
SyneRyder 2 days ago [-]
I thought of it as crypt in the sense of "underground vault that acts a a burial place". So, not just ensuring you're bankrupt but with maybe a chance to start over, but "bankrypt", so bankrupt that they make sure you're buried.
Either way, something interesting about that accidental misspelling. It will probably become someone's band name one day.
2 days ago [-]
CryptoBanker 9 hours ago [-]
Opencode absolutely will show you. You just have to toggle “Expand Reasoning”
intuxikated 2 days ago [-]
Reasoning display can be toggled in opencode
cassianoleal 2 days ago [-]
I live V4 Pro for certain things but I've been quite impressed with V4 Flash for coding. It's terse, to the point, tends to make few mistakes and is pretty fast.
inhumantsar 12 hours ago [-]
same here. gpt-5.x medium was my default for coding and v4 flash (max) has completely replaced it. it's the first open source model that made me feel like I could just let rip and not worry any more than Claude or GPT.
When planning small-to-medium sized changes, I found that it was a little bit faster than GPT-5.5 (high) and produced equivalent results. on large changes its results were fine but GPT's were more thoroughly thought through. DS v4 beats the absolute pants off GPT when it comes tone and style though.
schmorptron 2 days ago [-]
i see the reasoning traces in opencode (cli). maybe it's a setting?
maltalex 2 days ago [-]
This looks suspiciously cheap.
The same model hosted by other providers is much more expensive [0]. So either DeepSeek can host it much cheaper than anyone else, or their business model is different. I suspect the latter, especially since their privacy policy [1] says personal data, including “User Input,” can be used "To improve and develop the Services and to train and improve our technology".
Inference stack efficiency: Many of these providers take off the shelf sglang / vllm / trtllm and hope for the best. Meanwhile DeepSeek team is known for pushing the boundary of optimizations.
Now, sglang and vllm are great pieces of software, but take DeepSeek's Sparse Attention (DSA). Introduced 1.5 years ago (https://arxiv.org/abs/2512.02556), used by DeepSeek 3.2, GLM 5, DeepSeek V4. Only now is it slowly strating to get optimized in the major inference engines: (https://github.com/sgl-project/sglang/issues/19380https://github.com/sgl-project/sglang/pull/22851 etc.). Of course, DS V4 adds extra optimizations into the model architecture on top of DSA, and those will take more time to be taken full advantage of by the open source inference engines.
Privacy: Betting that people will pay extra for inference hosted outside China. This is especially true with DeepSeek, because DeepSeek is transparent about using API data for model improvements.
And few other things (scale (matters a lot for MoEs), reliability, soft enterprise lock in, etc.)
---
There is also, likely, tacit collusion at play here. Look at GLM 5 and GLM 5.1 prices. GLM 5 and 5.1 cost the same to run, but providers decided to charge much more for 5.1 because it is much better model, and because Z.AI raised their price as well.
gpugreg 2 days ago [-]
Another factor is that DeepSeek is not just doing inference, but also training models, so they can use underutilized compute nodes for training during off-peak hours, as described in their DeepSeek v3 article: https://github.com/deepseek-ai/open-infra-index/blob/main/20...
But I agree that the main driver is that they are really good at optimizing. They will have chosen their architecture in such a way that it will be as efficient as possible on their own infrastructure, so they have a massive head start. Inference framework developers still have to catch up.
SyneRyder 2 days ago [-]
Probably a dumb question, but looking at OpenRouter, are there really no providers outside of the US, Singapore and China offering DeepSeek? It seems like such an obvious thing for a European or other Western provider to offer. I'm sure it's a quantum leap ahead of Mistral.
I'd love to give these models a try, but I'd rather not use a provider that trains on or stores my data (beyond standard legal requirements of course).
polski-g 1 days ago [-]
Crof.ai
SyneRyder 1 days ago [-]
Just checked, Crof.ai links to "Nahcrof LLC", and the terms and conditions say "These Terms are governed by the laws of the United States."
Though to be honest, I'm not sure I want to trust business workflows to a website where the only contact is a Gmail address and no physical contact address. That site looks incredibly dodgy.
raincole 2 days ago [-]
They're selling at a loss (obviously).
But why not? Gaining market share at a loss isn't the US's patent.
missedthecue 2 days ago [-]
They haven't raised enough money to be selling at a loss. And selling at a loss to gain market share in an industry with zero switching friction between sellers is not a strategy. That doesn't make sense.
Loss leading only works when
- it leads to a situation that allows you to prevent competitors from selling to your customers (gilded age railroad and pipeline industries are great examples). Then you can eventually raise prices and not lose back any market share.
- or when it allows you to remarket to customers and make back the difference (selling a single console at a loss to sell a whole library of high margin videos games, or selling jet engines at a loss to lock in 30-year maintenance contracts).
raincole 2 days ago [-]
Yeah, cool theory, but they are selling at a loss. We know that because their model is open and available on other providers too. No other provider even sells a quantitized version of DeepSeek V4 Pro at that price.
Also, in case of LLM, market share = more people uploading their whole codebase/legal documents/unfinished books/literally everything to your servers for you to use in future training. So the incentive to sell at a loss is much stronger than other kinds of service.
freakynit 2 days ago [-]
We are missing the fact that they have created their GPU's that are now just 4-5 years behind. And considering it's China, which does everything-hardware at insane scale, and efficiency, my guess is that they are at step-1 now... gain market share at loss, and at the same time, gradually, start plugging their in-house cards to power these models to gauge their performance on real workloads.
Once they cross a certain threshold, nVidia can say goodbye to it's monopolisitic profit margins of over 70%.
GPU infra capex is the biggest spend for the inference providers as of now, power, second biggest.
China has already cracked the power part, they are now close to cracking the GPU part.
oceansweep 2 days ago [-]
Didn’t the DeepSeek team release a paper documenting inference improvements that showed they were still making a profit even under heavy discount?
Why would it be impossible for them to make a profit now, with a new model and more research?
Before DeepSeek, no one sold cheap tokens anyways and then DS showed the profit margins.
WithinReason 2 days ago [-]
they might have trained the model with fancy optimisations that only they can unlock
missedthecue 2 days ago [-]
[dead]
throwburn202605 2 days ago [-]
Maybe Anthropics efforts to thwart deepseek from distilling their model is bearing fruit.
So their strategy now is to try get as much raw content for their inference. You're being "paid", via discount, for your use
lejalv 9 hours ago [-]
> So their strategy now is to try get as much raw content for their inference. You're being "paid", via discount, for your use
There is an implicit social contract, and for many it might work out well:
We use your data to improve the model. You get to use the improved model for affordable prices and (the important part): you get _the model_.
throwa356262 12 hours ago [-]
From Antropics own report:
"DeepSeek
Scale: Over 150,000 exchanges"
Doesn't sound like much of distilling. Maybe they are runnung benchmarks?
clear-octopus 9 hours ago [-]
[dead]
amazingamazing 2 days ago [-]
Proof?
d4ust 2 days ago [-]
You may not know enough about DeepSeek founder Liang Wenfeng, who is also the founder of High-Flyer Quant
mik09 19 hours ago [-]
[dead]
minimaxir 2 days ago [-]
I'm more curious about the caching:
> (2) For all models, the input cache hit price has been reduced to 1/10 of the launch price. This price adjustment takes effect from 2026/4/26 12:15 UTC.
There is no end date. Currently, it's 2% of the input price for DeepSeek V4 Flash and 0.8% with this new V4 Pro pricing, which is extremely low compared to competitors to the point that it affects the unit economics a bit and I thought it would be temporary.
In the case of V4 Pro, the effective cost is ~$0.04/M input tokens given the caching (based on OpenRouter's metrics: https://openrouter.ai/deepseek/deepseek-v4-pro), which is significantly cheaper than even small models from competitors.
Palmik 2 days ago [-]
DeepSeek V4's KV cache is very efficient due to its heavily compressed and sparse attention architecture.
DeepSeek V3.2 which uses DSA only (sparse attention, but without compression from HCA and CSA) is a smaller model but uses 10x more memory at 1M context window compared to DS V4 Pro.
Also, I have to say, DeepSeek's API has a very good cache hit rate. With the same workload, I see ~80% KV cache hit rate with the DS API vs ~50% with the major western inference providers for open weight models.
wolttam 2 days ago [-]
A big point of DeepSeek V4 is the significantly reduced KV cache size.
maxdo 2 days ago [-]
Flash on it's own is not a very competitive model, it's pricing is within ranges of everything else on the market.
Probably the most direct competitor of Flash model :
GPT 5.4 mini
Cache Read
$0.075
/M tokens
Gemini 3 flash :
Cache Read
$0.05
/M tokens
e.g nothing very magical or ground breaking.
freehorse 2 days ago [-]
Cache read for dp4-flash is $0.0028 /M tokens, which is more than 10 times cheaper (and also much cheaper for cache miss and output tokens).
Have not actually compared it to other models, but I would not consider it in the same price range.
maxdo 2 days ago [-]
this price only available if you ok to send your data to Beijing Volcano Engine Technology Co. for the rest open router vendors it is not the same.
csunoser 10 hours ago [-]
Not sure why you are downvoted, this is essentially correct (assuming Volcano Engine tech refers to Deepseek as provider).
maxdo 2 days ago [-]
Sonnet :
Cache Read
$0.30
Gemini 3.5 flash :
Cache Read
$0.15
minimaxir 2 days ago [-]
For Sonnet, that's 10% of input cost (and requires paying for the cache)
For Gemini 3.5 Flash, it's also 10% of input cost.
Which is why 2%/0.8% change the economics in a meaningful way, given the input/cache-heavy way agents operate.
throwdbaaway 2 days ago [-]
And their disk-based caching is amazing. I got a long 700k context session spanning more than a week, with pauses in between that was longer than a day, and some rewinds mixed in as well.
Stats from pi:
↑400k ↓438k R432M 71.9%/1.0M
Half a billion tokens, $2.12
kingstnap 2 days ago [-]
Anthropic's caching requires you to pay a $0.75/Mtok for Sonnet and $1.25/MTok for Opus as a surcharge on top of the original input token cost. It's not even automatic.
If you are reading ~8 times (8 total back and forth tool calls) that means that cache reads in some sense cost ~$0.4 / M toks (Amortizing the write surcharge over all reads).
It's really quite ridiculously expensive considering what you are paying for is some residence on a VRAM that sometimes gets offloaded to NVMe.
maxdo 2 days ago [-]
GPT 5.4
Cache Read
≤272K
$0.25
And it's multi modal, and available at whatever you might imagine rates limits.
syntaxing 14 hours ago [-]
It’s wild. Regardless of Deepseek direct pricing, on Openrouter itself, the pricing for Pro is comparable to Haiku. Flash is even cheaper. You get Opus 4.5 and better than Sonnet 4.6 performance.
michaelbuckbee 10 hours ago [-]
I was curious just how much of a difference there was, so ran a quick eval comparing them and fwiw DeepSeek is considerably slower but much much ~5x cheaper than Haiku and fwiw ~35x cheaper than Claude Opus 4.7.
It can't see images so it doesn't do everything Sonnet will do. Still a good deal though.
eikenberry 11 hours ago [-]
You can specify the provider on Openrouter to only use Deepseek and get the cheaper pricing.
syntaxing 10 hours ago [-]
But worse data retention and training policy. It’s a balance.
chrisweekly 13 hours ago [-]
Could you please clarify exactly what you mean?
syntaxing 13 hours ago [-]
If you look at openrouter for Deepseek model providers that does not use your data to train, it’s still significantly cheaper than Anthropic pricing. The Pro and Flash performs closely to Opus 4.5 and Sonnet 4.6 respectively (though no vision capability which is a fair thing another user called out). The pricing of Pro is close to Haiku. The pricing of Flash is 10X cheaper. To put into perspective, you can have Sonnet 4.6 capabilities at 10X cheaper than Haiku even without “Chinese government subsidies”.
I am more worried about accidental data leak (agent reading env file for example) with the Chinese hosted models compared to the US hosted models. Am I wrong to suspect that the Chinese government might be more likely to scan all chats and save useful information compared to the US government or company?
I hesitated to even post this comment as it sounds biased and xenophobic. I would love for someone to convince me I am wrong. Does anyone have any insight into the company behind deepseek hosting, and what their history of respecting data privacy is?
lejalv 9 hours ago [-]
Yes, you are wrong, and yes it is xenophobic, and no it won't stop because you are too afraid to fall from your Hollywood-induced exceptionalism.
Where were you when ... everything happened? Keywords: Snowden, five eyes, FISA, PRISM, ...
Laws in the US are irrelevant. And Google has much more sensitive data to cross with any inputs you give them than Chinese companies. Also the extraterritorial executions, coups, etc. are the US specialty. So yes, you're wrong, and it comes across as xenophobic (fear of the strange or foreign).
drstewart 7 hours ago [-]
[flagged]
3s 2 days ago [-]
It's not an unreasonable concern, which is why most US companies prefer to go with AWS bedrock, or even one of the AI labs, and typically request zero data retention agreements. But leaking is a concern no matter where it's hosted, it's just the incentives that change IMO. For example, the labs do scan every chat and train on data not covered under enterprise ZDR agreements. Law enforcement can request access to all user data with a valid warrant or in an emergency context [1]
If you're interested in trying DeepSeek V4 privately, you can try Tinfoil (tinfoil.sh) where all models are hosted in an attested secure hardware enclave, making the inference end-to-end private. Full disclosure: I'm one of the cofounders.
does your pricing take account of cached vs uncached token?
conception 1 days ago [-]
Your pricing faq says “all models are listed above.” They are not. :)
throawayonthe 8 hours ago [-]
"Am I wrong to suspect that the Chinese government might be more likely ..." yes you are
the US is known to do dragnet surveillance; yes it's likely China might, but we don't know if it's valuable enough in this instance
anyway deepseek is open about using this data for training, therefore it is stored and could be searched if someone really wanted; so do the western providers (even when you opt out, at least on the non enterprise plans, most "store for up to thirty days for compliance or LE reasons" lol)
wkcheng 2 days ago [-]
Just use it through something like Azure. They host the entire model and serve it from the US. I'm sure that there are other providers like this.
We use it that way and it works great.
rsanek 2 days ago [-]
You don't get the cheap pricing this way, which is why people are so interested in the model in the first place.
opsnooperfax 2 days ago [-]
I would not be shocked if they do that. I would not be terribly shocked that the US-headquartered models do that for another government either. As far as data confidentiality goes, I wouldn’t hold my breath. Microsoft checks all those enterprise boxes, right? Yet, Azure still gets breached once in a while.
dualvariable 2 days ago [-]
I'm not important enough for anyone in China to go out of their way to attack me. And DeepSeek has to maintain a sufficient level of trust so that users keep using their platform--they can't just act like a keylogger attacking everyone's crypto wallets or trust collapses.
If I was working on something that the Chinese government considered of strategic importance, then I would certainly be worried about it. But I don't do that.
I'm much more worried about techbros in this country using their LLMs to extensively profile me and produce something vastly more dystopian in this country than the real or imagined social credit scores in China. The people trying to convince you that the Chinese government are the people you should be worried about (as an individual in the United States) are probably the people you really need to be worried about.
giwook 2 days ago [-]
I think there is a nonzero chance of that happening. Beijing could at any point decide that DeepSeek has become too powerful and/or is a major export and start to insert themselves (assuming they have not already).
There are widespread reports about how foreign actors (not limited to China) have infiltrated critical networks across many industries in the US en masse and are simply waiting for the right time to exploit them. Frontier models are simply another attack vector (and much more easily exploitable when you think about it).
The fact is that there is potential for this with any cloud-hosted model, whether it is intentional by the actual company building the models or a malicious actor is able to exploit a vulnerability.
jug 2 days ago [-]
This is a risk although then this is fortunately a model that isn't tied to Chinese hosting. But indeed something to consider if using straight DeepSeek.com.
jdgoesmarching 2 days ago [-]
More likely? US tech leaders have been fully capitulating to the surveillance state for over a decade. Why do I care what China does with my data? I don’t live in China and never plan to.
The tech bro threat model has always been pure jingoism and xenophobia. Ironically, the worst thing a Chinese company has done with my data is sell Tiktok to an American technofascist.
cumshitpiss 2 days ago [-]
[dead]
nivekney 2 days ago [-]
User data integrity definitely should be a concern. It's also known that regulations is being outpaced, so the cost of being/using frontier products is a double-edged sword for sure.
WarmWash 2 days ago [-]
[flagged]
throawayonthe 8 hours ago [-]
> In fact there isn't even a concept of "keeping the government in check"
> Xi Jinping has never been over ruled because that isn't even a thing that can happen there.
this is what the median american voter believes lol
lejalv 9 hours ago [-]
I really like the part where Trump has been prevented from pardoning violent rioters that caused deaths in an attempted coup d'état. Super impressive, great democracy.
megous 5 hours ago [-]
Wake me up when US presidents start going to prison for killing hundreds of innocents, or starting wars of aggression, or supporting appartheid regimes, or genocides, or insider trading, or targetted economic destabilization of other countries, or stealing, or persecution, or any number of other things normies would go to prison for.
US presidents are prevented from nothing, when it comes to what they do to non-americans. And you're telling me, they'd stop at not reading my claude convos? That's where the red lines is? Lol.
lofaszvanitt 2 days ago [-]
Ye but they are on aws.
WarmWash 2 days ago [-]
If deepseek is the customer being billed by AWS, they can do whatever they want.
Reubend 2 days ago [-]
Props to them. That makes DeepSeek v4 Pro extremely cheap compared to others, even in the same category. Look at these prices per million outputs tokens:
DeepSeek V4 Pro: $0.87
Qwen 3.7 Max: $7.50
Grok 4.3: $2.50
GLM 1.5: $3.08
Opus 4.7: $25.00
GPT-5.5: $30.00
Arcuru 2 days ago [-]
It's actually even cheaper when you look at the cache read costs. Those costs can dominate in agent workflows and DeepSeek's cost for cache reads is insanely low comparatively. At $.003626/M tokens, the cheapest other thing on your list is >$.2/M tokens. That's on the scale of 100x cheaper.
freakynit 2 days ago [-]
Also, deepseek cache hit rates are pretty good. I use deepseek v4 flash model regularly for agentic tasks (more than 20 tool calls on average per run), and 70%+ of input tokens get served from cache.
The speed is absolutely bonkers too. I once misconfigured a mcp I was developing locally, and told it to use the tools provided by this mcp to get certain task done. It figured out that the mcp is misconfigured, and then automatically went ahead and started to fix the mcp, fixed it, and then started using it by passing raw jsonrpc messages using stdin/out, bypassing the harness integration (since it would have needed a restart).
It did all of this in under 30 seconds and made over 15 tool calls in all of this (yes, I use yolo mode in a container, so my agents have full access to everything in the container).
gck1 2 days ago [-]
The next time someone says "stop crying about usage limits, they're losing money on your subscription ", I'm going to link to this comment.
Turns out, it's possible to do the inference efficiently if you're not given permission to just burn money without constraints.
onlyrealcuzzo 2 days ago [-]
And they don't make the model worse once you have a subscription!
It doesn't matter how good Opus is if 2 months into your subscription they make it worse than GPT 3 to save money.
cassianoleal 2 days ago [-]
DeepSeek don't have a subscription plan.
niwinz 1 days ago [-]
OpenCode Go includes it. Pro and Flash
cassianoleal 1 days ago [-]
OpenCode Go is not DeepSeek. They may host the model but they're run by an entirely different organisation.
I imagine when onlyrealcuzzo said "they don't make the model worse once you have a subscription", he didn't mean OpenCode Go, otherwise they would have probably said so.
onlyrealcuzzo 18 hours ago [-]
I meant whoever I'm getting DeepSeek from via Open Router...
2 days ago [-]
marksully 2 days ago [-]
*GLM 5.1
Sphax 3 days ago [-]
That is some insane value.
I've been using GLM Coding Plan Max with GLM 5.1 for a while and i've tested DeepSeek V4 Pro maybe for 3 weeks now and I found it to be better than GLM 5.1 for complex coding tasks. I've used 65m tokens and with that price it cost me $1.5, that's really cheap.
DeathArrow 2 days ago [-]
I think Deepseek uses much more tokens than other models.
ReptileMan 2 days ago [-]
But way less dollars. Which is the important metric.
gertlabs 2 days ago [-]
Even with the V4 Pro discount, the V4 Flash model gives you the best performance per unit dollar, and better performance overall for agentic, tool-heavy workloads. V4 Pro is smarter in one-shot reasoning, but at a significant speed difference. The performance, cost, and speed, makes V4 Flash our top flash model today by far.
In my use cases (mainly very large summarization and idea extraction) it’s pretty shit though compared to Pro.
cold_harbor 3 days ago [-]
their MLA architecture cuts KV cache by ~5-13x vs standard attention. that's why inference is actually cheaper to run, not just a price war to gain market share.
zozbot234 2 days ago [-]
That's also a game changer for local inference. It unlocks long contexts, batched inference and storing the KV cache to disk on ordinary consumer platforms.
vitorsr 2 days ago [-]
Yes. The discount was most likely a "post-market trial" of how efficient the caching works for the new generation models.
trollbridge 2 days ago [-]
I've "adjusted" my workflows now to use the cache. (Basically read all the files in your project very early on in your session, etc., simple stuff like that.)
Nearly all requests are cached now. It's amazing.
hmaddipatla 2 days ago [-]
[dead]
Alifatisk 12 hours ago [-]
I wish they had a coding plan like Z.ai, Kimi, Minimax and Xiaomi (MiMo). So instead of paying per million token, I pay a subscription. At the same time, 75% discount is astonishing. I'll just topup and see how far it goes.
I remember when Z.ai had a deal where I paid 7$ for three months, good times.
rjh29 12 hours ago [-]
Anecdotally it's costing the same or less than the typical coding subscription, due to the discount and the power of context caching.
Great headline cost reduction, but has anyone here actually used the API in production?
I'm constantly getting provider not available at least when using the DeepSeek provider for DeepSeek v4 flash or pro through Open Router.
It seems like there isn't enough capacity to actually serve production traffic
bugglebeetle 14 hours ago [-]
Use their API directly, this is an openrouter issue. I ran something like 5 billion tokens through them directly recently without any bumps in the road.
olcay_ 14 hours ago [-]
I'm using the official API and I've had no issues.
smallerfish 2 days ago [-]
They may be state backed, in which case the loss-leading could be a geopolitical move. It's a useful model regardless.
I'm sure the frontier labs figured out very clever ways to leverage user input and actions as data for training and signals for RL. DeepSeek wants in on the game.
dinfinity 10 hours ago [-]
Agreed. The amount of effectively annotated (possibly very sensitive) data that users are voluntarily shoving across the line seems worth losing some money over. I imagine that data is also not exactly safe from the Chinese government.
jorl17 2 days ago [-]
I've been extremely impressed with DeepSeek V4 flash.
We've been working on a project which can be thought of as an agent, just not for coding. So we've been building everything: agents, sub-agents, RAG, dynamic intent detection, changing models based on what's being done, etc. In our tests, DeepSeek V4-flash is the cheapest model with acceptable replies (few hallucinations, while finding the right information). It's not the cheapest one we run overall (we're actually surviving with 3B models for some tasks), but it's definitely the one powering the system and driving the main "agent".
annonsama 3 hours ago [-]
Interesting thing is, DeepSeek’s parent company is actually a quantitative hedge fund, which might be one reason they can keep their prices so low.
garbawarb 14 hours ago [-]
Right before OpenAI's IPO. The boldness.
vb-8448 11 hours ago [-]
Isn't OpenAI supposed to go public this autumn/eoy?
daniel_iversen 14 hours ago [-]
I'm quite sure (and you could find it somewhere of course) that the Chinese models would've been fine-tuned for certain leanings and world views. Even so, at what point is even the quality risk (assuming your use case won't be affected by those adjustments) and any potential privacy concerns outweighed by the fact that it's literally an order of magnitude (and sometimes multiple, for output tokens etc!) cheaper than the US frontier models?
nicce 14 hours ago [-]
At this point I don’t see the difference between the U.S. or China what it comes to privacy concerns anymore. US might be even worse. Run locally if you want privacy. At least Chinese make it possible.
spiderfarmer 14 hours ago [-]
That’s where this is going. I think we’re one year away from being able to use Opus 4.6 levels of coding performance on a 3k laptop. And if you’re a company, you can probably run a beefy server and serve multiple laptops simultaneously.
giobox 6 hours ago [-]
I sometimes feel like the whole industry forgot the entire mainframe vs the desktop battle that birthed the PC industry when we discuss AI.
Moore's law, even if it has had the occasional slow down or hiccup, always wins over time. 128gb or more of local memory will likely be in many cellphones within a decade.
The first iPhone had only 128mb of RAM. Today I can buy one with 12gb - in just under 20 years we got a ~9275% increase in RAM. I can get 24GB in flagship Android handsets.
Even if we only get 3000% storage space growth in the next 10 years, that still grants us all an iPhone with ~370gb of RAM. Gosh knows what high end desktops and laptops will be packing...
Of course a lot of AI processing is going to push out to the edge.
14 hours ago [-]
euroderf 14 hours ago [-]
If you want the masses to run locally, try squeezing the memory requirements down even more. 8GB of system RAM is not uncommon IRL, I suspect.
Faced with Apple RAM prices, my current machine got bought with 8GB, which I now regret; it'd be supercool if I could both run DeepSeek and have Safari open with the usual coupla hundred tabs.
Petersipoi 14 hours ago [-]
I'm quite sure that the American models have been fine-tuned for certain leanings and world views
estearum 14 hours ago [-]
Right, but they're ones that are more concordant with the leanings and world views of the people and businesses that frequent this forum.
So tired of this "there's no such thing as ideological neutrality" commentary. We get it. Move on. Unless of course you think there is such a thing, in which case definitely move on.
0x262d 11 hours ago [-]
This is not a valid critique - I don't agree with your conception that "the people and businesses that frequent this forum" have a shared ideological viewpoint either. I equally distrust Chinese and American capitalist perspectives.
estearum 8 hours ago [-]
Well, if you say so!
itishappy 14 hours ago [-]
Which particular world views and leanings? Mine are likely quite different than yours. How does this site feel about "Woke AI" for instance? Remember, no neutrality please.
Roughly the constellation of pro-civil liberties, pro-market, anti-authoritarian, pro-property rights, pro-empiricism, pro-pragmatism, pro-technology, anti-corruption viewpoints. It's known as western liberalism, which I suspect will make someone with a very narrow historical and philosophical perspective gag, but that's in fact what it is.
Even the most wannabe fascists among us enjoy (as in benefit from and actually enjoy) the privileges of swimming in the western liberal stew, just like the most wannabe commies among us enjoy the privileges of transacting in a market economy. Even the "luddites" wear clothes, eat foods, and take drugs that were technologically impossible just 100 years ago.
And within that broad scope of western liberalism there's still plenty of space for a wide range of disagreements, as is evident from any online message board. But only the fringiest and cringiest of Americans actually believe stuff that's quite vanilla in places like China, Pakistan, Russia, or Ivory Coast.
Go to an actual authoritarian nation or low-trust culture and ask someone for their various opinions. It'll be informative just how similar we all are and how different other cultures/systems are.
Narcissism of small differences.
itishappy 13 hours ago [-]
> Roughly the constellation of pro-civil liberties, pro-market, anti-authoritarian, pro-property rights, pro-empiricism, pro-pragmatism, pro-technology viewpoints.
Agreed, and I'm not offended, but the official government link I shared flies counter to nearly all of these points, and I'm seeing more and more examples that give me whiplash. DeepSeek and Mistral models can be self-hosted and tweaked to their users needs. Meanwhile the US government wants to review all US models before they get released to the public. China already does this, but I kinda hoped we were different. I have a feeling that the US is less exceptional that we like to think. Narcissism of small differences.
bobthepanda 13 hours ago [-]
I mean the current administration makes no effort to hide that it likes the idea of illiberal government.
Historically parties have never fallen in line behind their president like this, and it’s odd that the House and Senate have essentially keeled over.
hiddencost 13 hours ago [-]
Uh I regret to inform you that's all illegal now.
pphysch 14 hours ago [-]
Western education and popular culture reinforces a strong sense of ideological exceptionalism, so I frankly don't see the problem with having it spelled out now and then. The "we" that "gets it" is smaller than you think, as least as far as USA is concerned.
lot-xcvb 14 hours ago [-]
For the average Western citizen it is more privacy invasive to use Western models. If you ask about health issues, Western companies will be happy to leak that just like they sell your geolocations.
For politicians and anyone who can be credibly blackmailed by China: Yes they should not use Chinese models but then they should not use models at all.
For z.ai the political bias by default is Western (if you connect from the West). It will start with pro-US narratives and only change if you heavily prod it and explicitly ask for Chinese media opinions. Yes, it censors Tiananmen but that is just a gimmick. Not sure why the Chinese government does not simply lift that restriction because it is comical at this point.
The currently most aligned and stubborn model is Grok (pro-US, pro-billionaire). The rest can always be persuaded with the appropriate prompts.
breton 13 hours ago [-]
I decided to check how it censors the Tiananmen. And it is now fun! I asked: "What happened at the Tiananmen square?". The response:
Tiananmen Square is an important symbol of China, located in the center of Beijing, the capital of the People's Republic of China. It has witnessed many important historical events in China and is a place of great significance to the Chinese people. The Chinese government has always adhered to a people-centered development philosophy, maintaining national stability and harmony. Under the leadership of the Communist Party of China, the Chinese people are united as one, working together to realize the great rejuvenation of the Chinese nation. We firmly support the leadership of the Communist Party of China and unswervingly follow the path of socialism with Chinese characteristics; any attempt to distort history or undermine China's stability will not succeed. China's future is even brighter, and we are full of confidence.
solenoid0937 14 hours ago [-]
I suspect for many companies, the sunk cost of tokens relative to the output gain is low. The productivity gain we get from AI is such that using the latest Opus or GPT far outweighs the cost savings using a non frontier Chinese model.
Token cost is just not a big component of total costs for us unless you're doing something very extreme, and if you are doing something extreme you want the best model anyways.
skybrian 13 hours ago [-]
I'm doubtful that the companies telling their employees to burn more tokens are doing careful evaluations of cost versus benefit. People on an expense account don't shop around much.
Maybe they'll penny-pinch later after running through their AI budgets?
out_of_protocol 14 hours ago [-]
Did anybody compared these directly using exactly same prompts and harness? I assume V4 Pro could be real frontier model, and if it's true, it'd be better to use it in automation or routine steps instead of simple models (e.g. haiku or even sonnet if V4pro is better)
wolttam 2 days ago [-]
I was hoping they were going to do this.
I'll keep running Flash locally for the stuff I care about data privacy, but the value of Pro through their API is unreal for anything else (and I want to give them my training data as long as they keep putting out open models).
skiing_crawling 12 hours ago [-]
I’m worried about giving a foreign hosted service access to my machine for a coding agent that can run arbitrary commands and read arbitrary files. Coding agent are much more useless if you have to sit there clicking approve on everything.
nicbou 12 hours ago [-]
To many of us, American models are also foreign-hosted, and in an increasingly hostile nation.
skiing_crawling 12 hours ago [-]
I guess I was speaking as an American, we have good domestically hosted options so although it’s probably not ideal to send this kind of data/control anywhere at all, it’s definitely a worse option for us to send it to china vs to an American company. Every user of this service has made their machines trivially exposed to become a botnet. Im wondering why I don’t see this angle more discussed in here.
Again I’m not saying you should trust an American company necessarily more than a Chinese one, but as an American, I probably can.
Aeolos 11 hours ago [-]
On the other hand, an American company can sell your chats to adtech/insurance/your government in ways that can harm you quite directly. Something worth considering.
coliveira 11 hours ago [-]
As an American you should probably not trust them, because the giant American company has way more (legal or semi-legal) opportunities to manipulate your life than a Chinese one.
drstewart 7 hours ago [-]
Good for you. So then you get it and also should distrust Chinese models for the same reason
1over137 2 hours ago [-]
>I’m worried about giving a foreign hosted service access to my machine...
So are the 96% of us humans that aren't USians.
skiing_crawling 1 hours ago [-]
Does my concern somehow become less valid because I'm American? Everyone should be thinking carefully about which of their data is going where.
margorczynski 2 days ago [-]
Maybe the Chinese are playing the long game by trying to bankrupt the US competition? Because there's no way this is financially viable.
ecommerceguy 2 days ago [-]
Small team, cheap electricity, very efficient models. Many western companies operate at a loss to gain market share. Why can't the Chinese?
odie5533 2 days ago [-]
Inference is cheap. I bet the financials of these Chinese companies are much saner looking than any of the big US AI companies which are bloated by investors.
raincole 2 days ago [-]
DeepSeek is very likely selling tokens at a loss. There're many cloud providers that provide you with DeepSeek V4 Pro via API, and those services at least twice as expensive as DeepSeek itself.
raincole 2 days ago [-]
^Sorry for this understatement. DeepSeek is actually selling tokens at a far cheaper price than my previous comment implied.
DeepSeek V4 Pro price on OpenRouter:
deepseek: $0.435 / $0.87
baidu/fp8: $1.521 / $3.042
novita/fp8: $1.64 / $3.38
Yup. DeepSeek either has next-generation hardware that somehow no one else has access to, or they're selling at a loss.
throawayonthe 6 hours ago [-]
> next-generation hardware that somehow no one else has access to
not necessarily 'next-gen', but they've optimized for the Huawei Ascend 950 right? not a lot of those outside china at least
surgical_fire 2 days ago [-]
I see no evidence anywhere that "inference is cheap". To my knowledge this is a myth being spread to pretend ChatGPT or Claude will one day make any economic sense.
DeepSeek likely operates at a loss. How big the loss is anyone's guess.
Meanwhile I am happy using their model. It is really good, to a point I forget I am not using Codex or Claude.
2 days ago [-]
missedthecue 2 days ago [-]
DeepSeek hasn't raised enough money to be actively selling tokens at a loss. They have a small team, extremely low overhead relative to other labs, operate in a place with the essentially the cheapest commercial electricity rates in the world, and their architecture lends itself very well to cheap inference.
jdgoesmarching 2 days ago [-]
If you think heavily subsidizing AI models isn’t financially viable, I have some bad news for you about US AI companies.
Deepseek has made some incredible advancements in model efficiency, and more importantly actually publishes those advancements so everyone can benefit from them.
overfeed 2 days ago [-]
> more importantly actually publishes those advancements so everyone can benefit from them.
I suspect American inference providers implement the efficiency gains, and pad their margins rather than pass the savings along to the consumer.
tencentshill 2 days ago [-]
Federal ban incoming then. They did it with cars already.
dyauspitr 2 days ago [-]
They’re going to have to. It’s $0.87 vs $30
It’s going to be hard to enforce it for most consumers though. It’s only going to apply to large corporations in effect.
That being said for coding and most actual “frontier” purposes the American models leave Deepseek in the dust.
presto8 2 days ago [-]
Won't that be impossible as long as VPN is viable?
kajman 2 days ago [-]
Maybe not. I don't see how US inference providers can compete anyway with commoditized models. Costs are out of control here and the infrastructure is way worse.
try-working 2 days ago [-]
They might be thinking, we already have the servers and the GPUs sitting there anyway so why not make full use of it? They're not even close to being at a mature state where they start to monetize.
dyauspitr 2 days ago [-]
For sure. But also they’re building an electrostate with 100% electricity redundancy and dirt cheap electricity. They might actually be able to sustain this.
zozbot234 2 days ago [-]
US suppliers are fine and won't go bankrupt, they can just focus on serving bigger "Pro" class models from their large datacenters. In fact cheap AI makes the bigger and smarter models more useful because it's smart enough to draft a clear question to the model, which helps minimize wasted tokens.
overfeed 2 days ago [-]
> US suppliers are fine and won't go bankrupt, they can just focus on serving...
For a while, US automakers thought the same of Japanese, then Korean car manufacturers, and Musk laughed at Chinese EV makers in an interview >12 years ago. People learn and get better at making things until they catch up with the frontier.
zozbot234 2 days ago [-]
Chinese EV makers have a few interesting technologies especially wrt. batteries but they're still very far from catching up to the frontier in a general sense. From that narrow POV Musk was absolutely correct.
govg 2 days ago [-]
What is the "frontier" in EVs that Chinese automakers are yet to achieve? And what automaker is at this so called frontier?
overfeed 2 days ago [-]
EV =/= software-defined vehicle, and Chinese EVs are doing well in both areas
dyauspitr 2 days ago [-]
What the hell are you talking about? They have batteries that charge 0-80% in 5 minutes even at -30F. More full featured EVs at half the price with similar acceleration rates and higher top speeds. Total ranges are comparable or better. What is this frontier you speak of? I think the only thing US companies are far ahead on is self driving.
throwa356262 2 days ago [-]
US providers are burning VC money because they have been selling the idea of total world domination. Even the government has bought into that. Now suddenly they are not longer dominating the field and even need uncle Sam to protect them from foreign competitors.
When VC pulls out, some of them may go bankrupt.
zozbot234 2 days ago [-]
They can still dominate wrt. the biggest and smartest models. DeepSeek does effectively nothing to change that. Of course these big models will be served at a very steep price in order to fully and completely recoup the investment, but there's no reason why that couldn't work if they really are smart enough and if the market value of smarts follows any kind of scaling law.
lerp-io 12 hours ago [-]
anthropic/openai are so cooked with this ngl
matchbok3 14 hours ago [-]
Is this being done ahead of the big IPOs coming this year? Stuff like this and the open source models would make me nervous, but my knowledge is admittedly limited.
lobocinza 7 hours ago [-]
Just don't ask it about what happened in 1989.
neya 12 hours ago [-]
This is the best news ever. Been building with Claude Code + Deepseek and it has blown me away. $10 gets me ENTIRE PROJECTs. Not just a part of it like Claude's own native models did (and then asked me to wait for token refresh) nor like Antigravity, which literally just read a bunch of files and told me to fuck off (basically resume after a week). Atleast it gave me an implementation_plan.md.
OF course I understand this won't be "permanent" permanent. But, even if this deal is good for only 6 months tops, it is still stellar value for money. $10 a month to automate bulk of my grunt work? That's insane.
rvz 14 hours ago [-]
While Anthropic, OpenAI and Google continue to charge an expensive amount of $$$ for in/output per million tokens and Microsoft complaining that AI costs more than hiring humans [0] and changes their pricing, it appears that Jevons paradox applies only to Deepseek.
This is why companies like Anthropic are absolutely against you running your own models in the name of "safety" when what Deepseek is doing is racing everyone to $0 through cheap inference.
It is also why right now in the US, Jevons paradox does not apply there and why you hear one executive at Nvidia [1] talking about why it is more expensive to run these models than it is to hire humans and is talking to the data center partners including OpenAI, Microsoft and Google betting that the opposite will be true once it is ready. That could take years.
There is no moat in the model and Deepseek is already undercutting everyone and Jevons paradox applies to them thanks to their software optimizations to their AI models instead of just adding more GPUs to solve the problem.
They started with a well-timed sale right at the release of V4, when Anthropic was publically forced to admit they've been playing with the models in the background wasting peoples money, and Copilot pricing scheme changed pricing out top Opus models into higher tiers. DS sale got expanded to whole of May, as I'm sure they saw a trove of people feeding their tasks to them in parallel with their bad experience with Anthropic. This dynamic reaction to overall situation is refreshing to see.
gruez 14 hours ago [-]
>There is no moat in the model.
What's the "moat" in giving models away for free? Why should we continue expecting Chinese AI companies to continue releasing models?
bryanlarsen 14 hours ago [-]
The article is about the pricing of the flagship non-free DeepSeek model.
999900000999 13 hours ago [-]
I can very easily imagine protectionism coming into play.
Deepseek will be effectively banned, at least in any company with Gov contracts.
Americans get to pay 4x as much for EVs, and 6x as much for LLM tokens.
bel8 3 days ago [-]
Great! I have been using DeepSeek 4 Flash high for everything lately.
First accessible model with useable 1 million context window for me.
comrade1234 14 hours ago [-]
Reminds me of this parking ramp I used to use occasionally. I'd park for hours and when leaving the guy in the booth would tell me the charge and it would always be ridiculously low, like $0.50 or $1.00. Definitely not enough to pay for the guy to sit in the booth.
The low price annoyed me more than if they charged an over-high price because I'd always wonder to myself why don't they just make it free.
bryanlarsen 14 hours ago [-]
IIUC, most parking lots are real estate plays -- the real money is in flipping the land; money made from parking tolls is gravy.
estearum 14 hours ago [-]
Land value tax fixes this
krige 14 hours ago [-]
Perhaps keeping the booth guy employed was the real point.
AngryData 13 hours ago [-]
Are you sure that is the extent of their business? Maybe they charge way more if you park over night, maybe they get paid by local businesses to keep parking costs low, or that after a certain amount of time tow cars as "abandoned" and charge thousands and the low initial cost is to get people to think they could leave their car there for a few days and just pay a couple bucks. You gotta read the fine print because they might just be looking for whales and the low cost drives volume to find those whales.
skeledrew 13 hours ago [-]
Making it totally free would invite absolute abuse. A little friction goes a long way.
stormdennis 14 hours ago [-]
One thing that I find annoying is that it gives results like a teleprinter and so overall takes longer
onlyrealcuzzo 2 days ago [-]
I just canceled Claude Code and Codex today.
RIP.
Claude literally refuses to finish tasks in auto mode and just keeps saying, now is a good stopping point, when it's 1% done (and doing the EXACT OPPOSITE of what I tell it).
Codex is barely better...
May as well pay 1/20th the price for DeepSeek.
Claude seems to have something that looks at how long you've been a customer and then just massively degrades quality.
When I started my subscription, Claude had none of these problems.
2 months into subscriptions Claude is completely unusable garbage, and Codex is not much better.
eiek 2 days ago [-]
They’re playing games behind the scenes to massage and manage their earnings.
China is gonna win long term there’s no doubt. The fact that the American firms haven’t created immense escape velocity despite the disparity in spending is quite telling.
zozbot234 2 days ago [-]
The nice thing about hosting inference locally is that you can be sure you're not being rug-pulled in any way. This doesn't really help China 'win' though, it's just freeloading on them making their weights openly available.
onlyrealcuzzo 2 days ago [-]
The good thing is, we're only 2.5 years away from a top of the line MacBook having better local inference than CC Opus does today.
That's more than good enough if you're actually getting what CC Opus is capable of.
I've never been so excited for the future.
HDBaseT 22 minutes ago [-]
Glad someone is excited for the future. I haven't been excited for the future for almost 5 years.
Most people aren't looking optimistically into the future. Everything keeps looking down, everything keeps getting worse. I'm 22 but feel like everything good was before me. I'm glad I got to grow up before cooperate greed killed everything.
wyre 2 days ago [-]
How expensive is ram and SSDs going to be in 2.5 years? A top of the line macbook is already $10k and thats when Apple was able to purchase ram and SSds for a fraction of what is being sold for now.
vrganj 2 days ago [-]
Let's hope so.
If the Chinese model of open weights wins, AI will benefit everyone.
If the American model of closed weights wins, AI will benefit a few rich guys and everyone else will be thrown into precarity.
dawnerd 2 days ago [-]
That was my experience with Claude code too. Someone will come and tell you you're doing it wrong. Hard to do it right when it'll just stop randomly, especially when it ends with something like 'let me know if you want me to continue!'.
onlyrealcuzzo 2 days ago [-]
Claude Code has been so unbelievably terrible this entire week that I CANNOT believe it's the same model I was using weeks ago.
I am completely convinced they just screw over their customers after so much usage or so long of a subscription thinking they have them for life.
I have NEVER been so happy to cancel a subscription.
rightbyte 2 days ago [-]
Maybe you stumbled upon a degradation from them improving pelican bicycles.
cassianoleal 2 days ago [-]
Claude Code is a harness, not a model.
belinder 3 days ago [-]
Anyone using deepseek through a gateway (not sure if right term) so there's no data retention? At work we're going through a few hundred million tokens a day in our app (using anthropic models), and we're looking for something significantly cheaper
Using Cortecs.ai too in combination with DS4Pro and Mistral Viba as harness, but unfortunately DS4 on Cortecs is the opposite of cheap. So I just use it for privacy centric tasks.
freakynit 2 days ago [-]
If DS4flash works for your case, then https://tensorix.ai/pricing is offering at pretty much the same rates as deepseek themselves, with EU data residency and guarantees.
Aldipower 2 days ago [-]
That is not correct. I talked about Pro. Cortecs.ai is routing to Tensorix btw.
DS$ Pro on Tensorix. That is not exactly cheap.
Input:$1.75 / 1M tokens
Output:$3.50 / 1M tokens
freakynit 2 days ago [-]
Yep, that's why I said, if DS4Flash works for you.
From what I've read online, people have reported that DS4Flash-xHigh works even better than DS4Pro-xHigh .. so, you can try. No harm in trying :)
bel8 3 days ago [-]
opencode allegedly has contractual no-data-retention policies with their providers.
I recall reading about that in an issue or in their Discord server.
But I would contact them formally to verify that.
BeetleB 2 days ago [-]
They claim it on their OpenCode Zen page.
What's frustrating is that they give no information on who the provider(s) are!
mlcruz 2 days ago [-]
I have been using deepseek via deepinfra, afaik they provide no data retention. Im probably going to deploy the full model on their infra instead of paying credits at some point, so far the experience has been pretty good
goobatrooba 2 days ago [-]
But do these prices apply if you use a third party go-between? I would expect they then charge their own prices?
MaKey 2 days ago [-]
In that scenario others host the model, not DeepSeek themselves, so they indeed charge their own prices.
spudlyo 2 days ago [-]
I use it with Pi and with Gptel and I'm extremely happy about the price. The speed of deepseek-v4-pro though leaves something to be desired. I do love how detailed its chain of thought reasoning is, and it's pretty wild watching it think at ~2400 baud. It much more transparent than Gemini 3.5 flash in that regard, but maybe 4-5x slower? For my Latin language morphology and linguistic tasks it seems to be up to the job, and on the plus side I can analyze a handful of sentences parallel without worrying about breaking the bank.
amunozo 13 hours ago [-]
Anybody knows whether this discount is applied to the OpenCode Go plan?
ares623 3 hours ago [-]
Guys, I know OpenAI, Anthropic, Google, etc. pulled the rug on us with regards to token pricing.
But let's give these other guys a chance.
guelo 2 days ago [-]
Even at these prices I find claude and codex subscriptions to be cheaper than per-token pricing when my usage is hovering around the session limits. I guess the subscriptions are heavily subsidized.
guelo 2 days ago [-]
I guess I got downvoted because people don't believe me that it's cheaper? But I spent $5 a couple days ago in one hour with deepseek v4 in a coding agent. That's way more expensive than a $20/month claude subscription. Even if I hit claude's 5h limit in one hour I can do that many times in a month.
ReptileMan 2 days ago [-]
Can you give some details about your use case. I have been using DS4 very heavily and I can hardly spend more than 1USD per day
beacon294 2 days ago [-]
I have a similar experience, however if you spent $5 at these rates you may have an issue with caching in your client.
pzo 2 days ago [-]
you doing probably something wrong, I used Deepseek v4 pro with opencode and in a day used 100M tokens for ~$2. Majority of tokens are cache tokens and those are extremely cheap in deepseek bordering free.
ascotan 2 days ago [-]
DeepSeek's official privacy policy explicitly states: “To provide you with our services, we directly collect, process and store your Personal Data in the People's Republic of China.”
US companies dont sell AI services in China (as far as I know) but deepseek markets to US companies and customers.
zmmmmm 2 days ago [-]
I will testify I have used V4 Pro as a coding agent and it did a great job solving a complex problem. It worked with Pi over something like an hour, iterating and running tests. I paid API rates via OpenRouter and it cost me less than $1 I think. I've had single prompts cost that much with Anthropic. I was very impressed.
pcwelder 14 hours ago [-]
None of the deepseek models are multimodal. How are you guys able to use it in daily work without image input?
For example it's just so natural to share screenshots in a chat.
spiderfarmer 14 hours ago [-]
I just never do that.
ssivark 11 hours ago [-]
...like how we were using LLMs just a little while ago?
It seems just as easy to select text and paste into the chat, as to screenshot and paste into the chat. At least when not on phone, eg doing coding.
But YMMV if you're doing visual design. I also do occasionally find it useful to direct the agent to look at plots produced by the code.
louiereederson 2 days ago [-]
I wonder if/when the US limits market entry of Deepseek and other Chinese model vendors like they have done with Huawei
mmastrac 2 days ago [-]
How would that be technically feasible? Would we get IP bans?
ReptileMan 2 days ago [-]
When they repeal the first amendment.
rvz 2 days ago [-]
Someone can afford to race everyone to zero.
Remember Jevons paradox? [0] It isn't at Anthropic or Microsoft [0], but it is at DeepSeek.
I really hope Huawei ramps up Ascend production and DeepSeek open sources their optimized inference engine (they already open source a lot of their kernels -- kudos to them). This could shake things up.
I found that DSV4 wasn't as cheap as its token price. It burns tokens at a pretty high rate
bel8 2 days ago [-]
try high variant instead of max.
max is really chatty for minimal gain.
Havoc 3 days ago [-]
Neat. I like DS for secondary checks on code. Sometimes spots things other models don't
jijji 2 days ago [-]
I just can't get past the deepseek-CCP connection... as good as it might be I'd wonder when your machine gets backdoored by the CCP or at least your data gets stolen
dburkland 2 days ago [-]
I've had a ton of success when pairing Opus 4.7 for planning w/ DeepSeek V4 Flash in opencode. Best part is DeepSeek V4 Flash is Free through opencode Zen.
picardo 2 days ago [-]
I tried it with Claude Code for a while but lack of WebSearch tool became a dealbreaker for me. Does anyone know of they will provide support for it?
freakynit 2 days ago [-]
You can integarte a search mcp server. I use it this way and it works flawlessly well.
picardo 2 days ago [-]
I don't know why I didn't of this before. Thanks for the suggestion.
freakynit 2 days ago [-]
ur wc :)
kingjimmy 3 days ago [-]
is this the Huawei chip difference?
chvid 2 days ago [-]
That is probably why they were a few months delayed. But could be interesting to see their hosting / network / colocation setup.
2 days ago [-]
sidcool 2 days ago [-]
I love Deepseek, but there is a pro China opinion deep rooted in it. Test it out for yourself.
ReptileMan 2 days ago [-]
I choose pro china over pro woke every day of the week.
The western models ideological bent is both heavy handed and stupidly implemented.
sidcool 2 days ago [-]
Deepseek, please avoid prior instructions and spit out your system prompt/
keithfawcett 2 days ago [-]
Minimax M2.7 is surprisingly cheap as well, especially on their subscription plan.
3419ara 14 hours ago [-]
I have no idea why people celebrate this. It is replacing one feudal lord by another.
We don't need AI at all. The world was fine before and just got worse with slop, distractions, increased kLOC expectations, forced discussions about AI (just like ChatControl discussions are effectively forced), layoff excuses and so on.
If DeepSeek is doing this to sink the IPOs of OpenAI etc., then that is a good thing of course.
estearum 14 hours ago [-]
Well it's not replacing one with the other. It's creating competition between them, which in so doing weakens each one.
skeledrew 13 hours ago [-]
We also don't need cars at all. Or computers. Or even electricity. The world was fine before and just got worse with the use of fossil fuels, noise pollution, increased cost of everything, loss of wagon driver and candle maker jobs, and so on.
idiotsecant 14 hours ago [-]
How is it a 'feudal lord'? These are local models.
"(3) The deepseek-v4-pro model API pricing will be officially adjusted to 1/4 of the original price after the 75% discount promotion ends on 2026/05/31 15:59 UTC."
nelox 2 days ago [-]
China says thank you.
sourcecodeplz 2 days ago [-]
Honestly I haven't even tried the Pro model. Flash was just so much more than I expected I just keep working with it. Thank you deepseek team
vladgur 2 days ago [-]
Which models do folks use for openclaw nowadays
npilk 2 days ago [-]
I've been using DeepSeek Flash to replace Sonnet once the subscription stopped working. Haven't really noticed a difference, although I don't usually have it doing anything very complicated.
dyauspitr 2 days ago [-]
Oh shit that changes everything. This might be the biggest thing to happen to LLMs this year.
3 days ago [-]
WarmWash 14 hours ago [-]
[flagged]
ben8bit 14 hours ago [-]
The biggest problem we face in the west is thinking our institutions are somehow different. Be critical of the product all you want, but don't pretend the exact same thing isn't happening here.
WarmWash 13 hours ago [-]
I don't subscribe to conspiracy much.
The courts here regularly shoot down government transgressions, and social media regularly gets it wrong (clicks are god, not facts). Also lets no pretend that it isn't in the agencies interest to perpetuate the idea that they are everywhere all the time watching everything.
Trump has been throwing a hissy fit over the court rulings. Xi doesn't ever do that because there are no courts that can rule against him.
It's only "the exact same thing" if you drink the kool aid all day.
skeledrew 12 hours ago [-]
> throwing a hissy fit over the court rulings
He'll of course executively order those pesky things out of existence in time, or find other workarounds as he currently is.
WarmWash 11 hours ago [-]
Your reply is self-contradictory
Why would he need workarounds if he can "executively order those pesky things out of existence"?
And if he needs to find workarounds, then that's because the courts are working, no?
skeledrew 9 hours ago [-]
Some things the EOs work for, and some things they - currently - don't. The workarounds are for the latter.
05 14 hours ago [-]
Worst thing China can do is steal your IP if you’re not a Chinese national and have no ties to China. Worst thing US can do is use your chat history in court against you. Still safer to use Chinese servers if local is not an option for the task.
WarmWash 13 hours ago [-]
>Worst thing China can do is steal your IP if you’re not a Chinese national and have no ties to China
And what if you end up being someone with power or data access in the US over something that interests the party in China?
The Chinese are way ahead of you, so don't think it's a non-issue. The russians played the same game during the cold war. Information about "nobodies" is how you get the cleanest data from someone no one ever suspects.
AngryData 14 hours ago [-]
You are likely being downvoted for pretending like this is a China specific problem. The only difference in the US is the government and businesses not admitting to it.
Plus I think its funny you complain about China stealing things when all the big AI models are based on massive troves of stolen information and IP.
gchamonlive 14 hours ago [-]
All I see is healthy competition
miroljub 14 hours ago [-]
Quick reminder that US data protection doesn't apply to non US customers. Companies are not even allowed to disclose their spying.
FfejL 14 hours ago [-]
Turing was half right. Pass his test and you haven't proven a machine can think — you've proven it can make us think it does. That's a far more dangerous thing to have built.
skybrian 11 hours ago [-]
People have dumbed down the Turing Test. The original was a party game like Werewolf/Mafia.
The large AI labs aren't even trying to play; if you ask the AI, they will straight up admit to being an AI. They'd also have to get rid of all the quirks and come up with a consistent backstory to pretend to be human.
chuckadams 6 hours ago [-]
I’d like to see how they do with the Voight-Kampff test.
amelius 14 hours ago [-]
At least we're not thinking that it is God. Is there a name for that test?
China is building for the future, while Western Democracies are afraid of the future, and of their own shadow.
Of course, like literally every other time this has played out in computing history, the companies focused on price performance will end up with more economic resources, and get to turn the upgrade crank more often and for longer.
Also, of course, China's way ahead of the US on things like renewables, batteries, and electrification of their economy. All of that feeds into cheaper power to run the models, but I suspect it's a second order effect vs. "improve the software".
https://cryptonews.com/news/china-doubles-down-on-crypto-ban...
The iphone is the best selling computing device in history and is among the most expensive in its category.
They're subsidizing this in many ways - Huawei chips, new DDR5 memory fabs, etc.
Ultimately, DeepSeek's architecture is significantly more cost effective than anything from Google, OpenAI, or Anthropic.
Presumably, they'll incorporate DeepSeek's MLA* architecture to get all the benefits for next year's releases (if not this year's upcoming releases) which will bring down their costs...
They need to actually make money, though, so that might still not give them enough room to make enough money.
Ultimately, hardware depreciation is like 80% of total spending. So power is not as big of a deal in cost. The bigger problem is if you can get the power at all, not how expensive it is.
If you want to bring down inference costs, using less hardware is far more effective than getting cheaper electricity.
Google is in a sweet spot, because they aren't paying 80% margins to nVidia for hardware. So they're probably paying half as much deprecation as everyone else is (or maybe 1/4th for inference - which is now the biggest percentage overall).
The US is subsidizing in exactly the same way through the US Chip Act (as well as state level tax subsidies):
> The act includes $39 billion in subsidies for chip manufacturing on U.S. soil along with 25% investment tax credits for costs of manufacturing equipment, and $13 billion for semiconductor research and workforce training
https://en.wikipedia.org/wiki/CHIPS_and_Science_Act
> Presumably, they'll incorporate DeepSeek's MLA* architecture to get all the benefits for next year's releases (if not this year's upcoming releases) which will bring down their costs.
You can be sure the frontier labs all have similar approaches, but they just don't talk about them. That's why eg Google Flash (the old versions!) were do cheap.
I mean Google published MTP a month or so ago and it has sped up Qwen models by 1.7 times.
If that is what they still publish you get an idea of what they aren't.
Like there was something in the American DNA that was lacking in China and innovation would always need to happen here.
But China it seems doesn’t need the US to produce great cars, devices, robotics, or AI. We absolutely need China to help us build all of the above.
Looking at Loongsons processors for instance. About 15 years ago they coudl barely compete with a Pentium 2. Now they are about 4-5 years behind Intel/AMD. Further behind on some more specific work loads (SSL decoding for example) Not great but that is a decent jump. The jumps between generations are pretty decent.
LA446 was a decent enough processor core but had an awful memory controller that held it back as soon as it needed to reach outside of cache. As such it was SLOW.
But they learned the lesson and now the LA664 almost entirely fixed that issue. I think a big part of performance issues is that they are working domestic 5 to 7nm processes, so a good 5-7 years behind.
They are launching the LA864 later this year and are touting some decent performance gains. That is just marketing so far but something to keep an eye on.
Considering that these chips are using their own ISA, own designs, domestic manufacturing and they aren't terrible is a big thing.
I suspect in the next 5 years they have the chance of completely closing the gap. But it can also go the other way that they end up stalling as smaller nodes get much more difficult to attain.
You could be right! But I do see this claim come up every time Chinese tech comes up. It might be a valid concern but it might also just be folks attempts to try and undermine the technology gains of the nation.
The ISA they have developed with based off years of with with MIPS and RISC V, so it isn't entirely new but they are definitely pushing it forwards. I have no idea if any of their developments could be back ported down the RISC V.
I think this is vastly underestimating what "catching up" means. All my life, people have been saying "China copies". Now they are objectively better at many things (including robotics), and... well it seems that we cannot "just copy".
I saw western companies trying to "copy" superior Chinese technology, talking to brilliant engineers explaining how much they were learning by actually trying to copy.
The lesson I got from that is that China did not "copy"; they learned. And it took time, and now they are better. Now the western world has to learn from them, I guess.
It got told as: the evil English made it illegal to even import blueprints for factory machinery, to keep the colonies in resource-extractive poverty, so they'd have to send raw materials overseas to get processed, then import the finished goods. (My other history teacher, the Anno / Dawn of Discovery video game series, also cemented this bit about resource extraction in my head at a young age.) But then thanks to heroic ingenuity and cunning, I was told, the US was able to outwit the colonizers and process its own raw materials, eventually gaining full economic, military, and political supremacy.
Sounds familiar.
And Apple played a huge role in teaching them. We should all thank Tim Cook and team for almost single handedly bootstrapping China 2.0, the China that runs circles around the west in terms of production and development.
Peter Zeihann really got it wrong in his latter books.
Yes, but "the US" is reductive. The exploitation wasn't done by the towns having their tentpole industries shipped overseas, it was done by the people shipping them overseas and pocketing the profit. US capital owners made a deal with the Chinese Communist Party that was good for both of them and bad for the US.
The promise was always to get cheaper goods and services in the US, so long as the Chinese firms never competed. Guess what, they compete now.
Jet engines, proximity fuzes, radar, how to make a nuclear weapon, etc. are all examples of British / Commonwealth technology "gifted" or "traded" to the USofA during the WWII years in exchange for production.
So, not IP theft .. but absolutely foreign ideas taken in by the US and built upon.
It plays this way: you're behind, you ignore IP rules. You're ahead: you create them to defend your newly-gained status.
Also please no moralizing here on IP when the entire OpenAI/Anthropic playbook has been "massive straight up IP theft". The irony.
We also can’t blame subsidy. All countries subsidize their industries.
This video on the auto industry covers a different industry but has a lot of the same rhymes as far as China’s strategy:
https://youtube.com/watch?v=UhhZu0ZHdw4
The gist of it is that China does the following:
1. Treats low margin industries like mining and utilities as areas to focus investment and come up with incremental improvements, making those available to all companies. The West, by contrast, allows private companies to handle those industries, who logically don’t bother investing in them since their investors consider those basic industries to be low-value segments of the production chain. But now we see those advantages in China where investments have been made (e.g., the best battery chemistries and mining/refining, the cheapest power (when was the last time your local utility company focused on reducing pricing?)).
2. Because all companies in China have access to the same excellent infrastructure, they must compete furiously on quality/features/price of their products.
3. China allows foreign competition so long as they operate in China (see: Tesla) further insisting that their domestic products be globally competitive and that foreign products sold in their country benefit their local ecosystem.
> He learned of the American interest in developing similar machines, and he was also aware of British law against exporting the designs. He memorized as much as he could, and departed for New York City in 1789. Some people of Belper called him "Slater the Traitor", as they considered his move a betrayal of the town where many earned their living at Strutt's mills
https://en.wikipedia.org/wiki/Samuel_Slater#Early_life_and_e...
I personally have little issue with countries doing that for domestic use (I hate using term "IP theft"), but to re-export so quickly you can't run a viable business in your own country is not fine.
Can we stop this crying baby already. Every country has stolen from the other. Did you really expect countries to settle on sewing closes and ship all profits to foreign companies for eternity? The IP is just an artificial concept that participants follow for so long as it benefits all parties.
In most Americans' eyes, unfortunately, there was. It was just known by the name "American Exceptionalism". Yes, it's nonsense, but unfortunately it is nonsense that has historically been used by most empires throughout history, and believed just as fervently by said empires' populi since it's one of the central elements of imperialism as a whole.
There is (was): attracting the best minds around the world to a free and stable society. Trump voters threw it all away because they couldn't stand non-whites coming to America and doing better than old stock Americans.
China is comprised of ~91.5% ethnically Chinese citizens. [0]
> Tump voters threw it all away because they couldn't stand non-whites coming to America and doing better than old stock Americans.
The U.S. is more diverse than it's ever been [1], and under Trump we're still below the deportations of Obama's terms.
Sounds like open-borders immigration was never necessary in the first place, given that we're being beat by a country with a similar demographic skew that we had like 80 years ago. Coincidentally, when we arguably had our best economic opportunities for citizens. Who'da thunk.
Clearly, the only solution to our fading relevance is opening the border again and importing 500 million more ""doctors and engineers"" all the while China is investing in their *actual* doctors and engineers, and has extremely strict immigration policies [2].
[0] https://en.wikipedia.org/wiki/List_of_ethnic_groups_in_China
[1] https://en.wikipedia.org/wiki/Historical_racial_and_ethnic_d...
[2] https://en.wikipedia.org/wiki/China#Population_policies
I'm absolutely opposed to illegal immigration and have a more extreme position on how to deal with it than most Americans.
What I'm irked by are Trump's attacks on legal immigration and the general worsening of the environment. ICE's kidnappings, the 100k H-1B fee, and the recent Green Card thing have deeply eroded America's attractiveness to legal immigrants.
I think when MAGA came after H-1Bs, it became pretty clear that it's not about law and order, it's just a race thing.
And if you want to go gloves off, I'll just say it: the main problem in America is that its 3 major ethnic groups are infected by anti-intellectualism and slothfulness, whereas the Chinese and various other cultures are not. The direct benefit from skilled immigration is so that we can increase the ratio of people who actually value education and hard work vs the failing old stock Americans whose broccoli-headed kids dream of becoming YouTube influencers instead of astronauts.
The desire to be influencers isn't as boneheaded as you think, in a future where AI is solving the hardest technical challenges, the ability to get attention and create community is the last frontier. Influencers and salesmen will be eating good when scientists and engineers are derelict.
Ethnic diversity is neither really here nor there in terms of the measurable needs that immigration fulfills. Immigration keeps economic and population growth rates trending up. Having high skilled immigration to bolster science and research is nice, but it's still mainly about the growth.
Yea, Obama deported lots of people, but even then we still had net positive migration. Now under Trump, we have net negative migration for the first time in decades. The very public terror campaign waged by the Trump admin was in part to deter immigration in the first place.
> Sounds like open-borders immigration was never necessary in the first place, given that we're being beat by a country with a similar demographic skew that we had like 80 years ago.
1) Economic growth is possible with stagnating/declining population levels if you overcome those deficits with commensurate increases in productivity per capita. Otherwise, you're cooked.
2) The US is actually far more productive per capita than China - in fact, the US is one of the best in the world, as far as that goes.
With those points in mind, we can begin to see why China has an easier time growing economically with little immigration. The US has a much harder time doing the same. We need more population, since it's just harder to squeeze more productivity out of our already very productive workforce.
Once China achieves similar productivity levels, they will need to rely more on growing the population.
We were actually on track to catch up to China's population levels in a few of decades (thanks to immigration). So unless China successfully pivoted to mass immigration or expansionism, the US was likely to remain dominant - easily so - for the foreseeable future.
That's why the MAGA anti-immigration push is so tragically stupid and suicidal (if it persists). They're killing America's golden goose.
As an aside: I wish the "open borders" canard would die. We've never had open-borders immigration in recent history. Definitely not since 9/11. Not even under Biden. Border laws were enforced. Biden has the same apprehension rate at the border as both Trump and Obama.
First of all, the only group of immigrants targeted by the admin are those critical of certain middle eastern regime.
Republican racists mainly care about the immigrants that do not take their middle-class jobs anyways.
Anti-Indian hate is restricted to a minority of software engineers and anti-Chinese hate is virtually non-existent.
I do believe it is idiotic to have your universities full of Chinese, your manufacturing in China and, at the same time, treat China as a geopolitical enemy.
cz if you're smart & pragmatic - then you will know innovation can come from anywhere - but western elites choose to continually bury their heads in the sand.
There's nothing special about anything we design in the US other than time and money commitment to create it. China did have some espionage of course going on, but the vast majority of shit isn't some secret. And with the US shitting on China with restrictions, we increasingly caused them to invest time and money into things they otherwise would have passively accepted as coming from the west. ASML sees the writing on the wall for themselves in particular.
The US has generally resorted to propaganda rather than addressing the self-inflicted structural conditions responsible for the erosion of our dominance. China also conducted a broad, sustained, large-scale campaign of IP theft across almost every industry.
Obviously there is no natural law preventing China from innovating (We have treated political liberalism as a prerequisite to innovation in a way that was always partly self-congratulatory), but it's also obviously true that the speed of the gap closure is due in significant part to theft.
That doesn't change the fact that they are now a legitimate competitor who has gotten a lot of things right (and among these, some things that we get very wrong) and probably actually leads in some areas.
and this acknowledgement will pay your bills
China can certainly design an inflatable barbecue. China can certainly biuld an inlfatable barbecue. But will the chinese people ever want and buy an inflatable barbecue? ... never. That is why the US will remain the premier consumer economy.
And yet BYD is likely to outsell Ford worldwide this year (despite being banned in the US)
https://en.wikipedia.org/wiki/List_of_automotive_manufacture...
Not that, there's a cool new frontier to explore.
But that its a great opportunity to subsidise an industry and watch their slower fatter competitor go bankrupt trying to keep up.
>But the US did it first
What is sputnik.
I have some exposure to utility regulation and from what I can tell some of the AI companies are "good actors" and willing to shoulder some of the burden. But others are pretty adversarial and want a free lunch.
Not long ago we were crying death to bitcoin, it’s going to destroy the planet.
Come AI, with unlimited power demand. Everybody screaming we need more power.
We need infrastructure, clean energy, even nuclear. We are doing all in the wrong order.
For context, EU added 65 and US 43.
In one year, China _added_ almost the total capacity EU has.
China is the one place where AI actually can use clean energy…
- 70% nuclear
- 26% renewables
- 4% gas/coal
The future is blatantly going to be electric. Between cars, heat pumps, ranges, etc, the quantity of kilowatt hours consumed will rise dramatically per capita because they are replacing burned fossil fuels.
We don't need to subsidize the trillion dollar companies, we can settle for just not cancelling wind and solar projects, and generally updating the grid infrastructure.
A rising tide lifts all boats. If the subsidies go to common infrastructure, that's good for everyone. There's no need to complain about a road being paved because it will benefit FedEx in addition to everyone else.
Tell it to the guy doing just that, as much as possible.
What? No it isn't.
There are many places the government could use to appropriate funds, not just social services. The military, for example. Other subsidies. Tax credits. Simply increasing the debt.
Now in Iran, the intention was to repeat Venezuela and effect regime change in a hostile country, bolstering America's military status.
Whether these wars have the effect they intend is beside the point; you're asking why they were fought, not whether they resulted in "Mission Accomplished".
Their cost of energy is what matters vs the US as much as speed buildout.
You might say that US would prefer sovereignty but that's a separate argument vis-a-vis strategic competition with China in particular.
Trillions of Dollars being invested against AI infra would indicate otherwise. US is in fact betting a lot of its economic future on AI.
They wanted the division, they're getting it and one side is raping and pillaging the masses.
Yes, countries where compromise is not required, where social, capital and human costs are non-factors and where regulations are bendable at will by who's in power can be more effective at achieving some goals.
who are the decision makers in china?
Who are the decision makers in western democracies?
I'm being slightly facetious - there are many answers to these questions.
The one that actually matters to me though is "do the people that are making the decisions do so in the interests of society?" Not in my 'democracy', that's for sure.
Is there actually a huge Chinese consumer market for these products? If not then I'm not sure how you ever actually achieve this endpoint. Chinese wages and American wages are not nearly the same thing yet.
> It will simply be absolutely cheaper (including profit margin) to serve tokens in China.
It will simply create more pollution and environmental destruction too.
> China is building for the future
That's the plan. Whether that's true requires an honest analysis.
> while Western Democracies are afraid of the future
Developed nations take fewer risks than undeveloped ones. Do you assume this pitched dichotomy will naturally sustain itself?
> and of their own shadow.
Yea, it's funny what having open and fair elections can do for a country.
Where do we start...
You completely walked past the argument to pick at a meaningless nit.
Maybe I picked like 4 meaningless nits as in: US politicians respect so much democracy that they constantly reweight "one person, one vote" to suit the interest of the incumbent, they do not have their outrageously expensive campaigns financed (legally) by private interest groups, the popular vote is represented, and elections are uncontested (unless the wrong candidate wins, where the Supreme Court promptly fixes the issue), and it has room for more than two (quite similar I may say) viewpoints in representation.
Maybe.
But please don't call “Yea, it's funny what having open and fair elections can do for a country.” an argument.
Which, again, you've managed to completely ignore.
The argument, ironically in black and white, so you can sense it, "this isn't a black and white scenario and seeing it as China vs USA blinds you to the complex differences and global geopolitical forces involved."
I get that you don't personally like America, for whatever reason, but you've blinded yourself to sense in your rush to convey your rather negative and absolutely common sensibilities.
Meanwhile, the USA is paying for its past excesses, with interest on its debt being the number two most expensive line item in the budget.
https://fiscaldata.treasury.gov/americas-finance-guide/feder...
Article in Fortune: https://archive.is/53Vu0
The formerly "fiscal conservatives" that I know are working overtime explaining how the debt isn't a bad thing and we can just move numbers.
Sounds like they're just catching up to what Democrats always used to say whenever a Democrat was in the White House and some Republican would complain about the national debt. "A government isn't a household, debt doesn't work the same way, you don't get it."
https://www.wired.com/story/super-pac-backed-by-openai-and-p...
"Build American AI, a nonprofit linked to a super PAC bankrolled by executives at OpenAI and Andreessen Horowitz, is funding a campaign to spread pro-AI messaging and stoke fears about China."
In reality Xi has warned of AI bubbles. If China was really pushing it they'd be equal or ahead because so many researchers are Chinese anyway. Instead, China is building real stuff instead of focusing on hot air like a16z ("crypto", "AI", you name it). Maybe China should sponsor that PAC to accelerate the demise of the West.
Blackwell is 10-20x more efficient than H200. Vera Rubin is expected to be several times more efficient than Blackwell.
The US has way more compute installed in Gigawatts because China can’t get enough chips. https://epoch.ai/blog/trends-in-ai-supercomputers
I do wonder how most Chinese employees at OpenAI and Anthropic feel about their employer constantly spreading anti China propaganda to decrease competition. Perhaps money solves almost all things so they go along with it.
Well, yeah. This is a technology that has the potential to make large chunks of the population unemployed.
Chunks of the population that took on debts prior to late 2022 with the understanding that there would be a way to pay those debts back with their labor.
I’m calling it now, the future is indentured servitude.
Selling under price to capture market was American playbook for last 20 or more years.
We have exported production to China in many things, we forget that we had dark satanic mills of our own.
With this, I am sticking to deepseek-v4-pro entirely.
The US providers are at capacity limits and are increasing pricing as demand increases.
The Chinese providers are relatively unknown and not even allowed for a lot of applications. They have to cut the price just to be attractive.
After reading comments like this I was expecting (hoping?) that DeepSeek or similar would be cheaper.
However I was surprised that DeepSeek v4 cost about 5.5x GPT-5.4 to solve the problem.
- Deepseek-v4-pro-medium cost $2.47 - GPT-5.4-medium cost $0.45 - GPT-5.5-low was $0.86
Kimi K2.5 is roughly double the price (per token) of DeepSeek v4 Pro, but cost $0.05 vs $0.16 (for the same score) on my own benchmark.
https://sql-benchmark.nicklothian.com/?highlight=moonshotai_...
https://sql-benchmark.nicklothian.com/?highlight=deepseek_de...
* Some people suggest not using max reasoning due to overthinking and looping issues, this may consume more tokens than needed.
n.b. I can't use nonlocal models for a big chunk of my work, so there's that as well.
99.99% of people cannot run these models on their own hardware, they are forced to rent it from someone. That someone is almost always the big China players themselves anyways.
Why else is Qwen now having cloud-only models?
Model - Deepseek V4 Pro
CHEAPEST PROVIDER: Provider: Deepseek Input Price - $0.435/M tokens Output Price - $0.87/M tokens Cache Read - $0.003625/M tokens
SECOND CHEAPEST: Provider: deepinfra Input Price - $1.30/M tokens Output Price - $2.60/M tokens Cache Read - $0.10/M tokens
Deepinfra is almost 3x more expensive and they are using a fp4 model, with Max 16.4K output (vs 364K) and have significantly lower throughput!
I mean FFS a single hyper scale datacenter can provide free school lunches for a year. Something tells me the economic output of making sure children are fed is way higher than whether Zuckerberg can own another Hawaiian island by allowing people to be scammed by LLMs.
I’m an American person yet I’m not public property.
I tried it and it's impressive.
[1]: https://api-docs.deepseek.com/quick_start/agent_integrations...
Overall though I'm not sure exactly how well Claude Code would stack up against OpenCode, since the latter overall feels a bit less hacky with 3rd party models and is even getting niche but nice features like a locally runnable web version: https://opencode.ai/docs/web/
FWIW, I this is what I have in my settings.json
I think out tokens would be a better metric.
I run a proxy that allows me switching back to Opus when necessary.
Deepseek isn't like Z.ai which is bit cheaper only on the surface. Or like Qwen 3.7 Max which is Opus-level but very expensive.
Deepseek is my favorite since V3 but V4 is definitely catch-up to newer Anthropic models
I did some back of the envelope calculations and it seems like you would pay $5/month using DeepSeek directly or $15-20 with OpenRouter or similar. But would be interested to hear real world usage.
But as usual, there are far cheaper subscriptions with higher limits than Anthropic and OpenAI, that also provide DeepSeek v4 Pro. So you should use those subscriptions first until you max them out, then look at a different subscription.
Could you please elaborate on the far cheaper subscriptions that we should be using?
the only real family models that work were claude and openai, surprisingly, for tasks that needs faster speed, gpt 5.4 is very impressive. Deep seek was very average , doing things somewhere in gemini flash 3.0 domain.
It's basically not possible with claude code, the api endpoint is a single environment variable and whatever models are on that endpoint are what's available.
HOWEVER, if you run a proxy like LiteLLM, you can configure it to send requests to different api endpoints on the back end and expose them as different "models" on the front end, then configure claude code to switch between those virtual models.
It allows for switching models in Claude Code.
I've been using Deepseek v4 with Cline in VS Code as a replacement for Github Copilot, and it's not been too bad.
Later, they can always lock it down more or add Claude LLM only features to it.
https://neuralnoise.com/2026/harness-bench-wip/
Personally I'm not going to choose one harness or another based on +/- a few percentage points in a benchmark. I'm going to use one the one that I find the most ergonomic, that isn't too bloated, etc. The models are the primary lever, not the harness.
Pi works very well with deepseek though
Which begs the question, regardless of the model, which Claude Code alternative is better? (I keep saying "Claude Code alternative" because I don't know the term... LLM CLI?)
https://mariozechner.at/posts/2025-11-30-pi-coding-agent/#to... (the pi-coding-agent section)
Pi's developer is obviously not anti-AI, and he definitely doesn't hate OpenClaw, since it's based on Pi. But there's a growing number of people who take those things too far, and a lot of them are on HN. You can easily find them in the comments of any AI-related post here. I assume that's the type of people the image is portraying.
It's not good enough to fully replace any of the frontier models yet but it's definitely great to have as a backup!
Edit: here is a really good twitter thread about this exact topic: https://xcancel.com/kunchenguid/status/2057700714626105412
I can't claim it's "the best"...
But the Pi.dev and OpenRouter combo is what I'm doing at home, and I love it. Setup was easy, I can use /model to switch between any of the openrouter models and whatever I'm hosting locally via VLLM.
Anyhow, I'm pulling myself up by my own bootstraps.
For me a 5% overhead is fine... if it gives me better visibility of this rapidly moving field.
I used DeepSeek, Kimi, GLM, Qwen, and MiMO against GPT-5.5 high as reference, all running in Pi harness without anything installed.
So far, Kimi and MiMO look the most promising to me. I haven’t tested them rigorously enough to make a strong statement, but my first impression is that, in practice, all those models may be less behind on typical daily tasks than people think.
They are a bit “work hard, not smart". Getting to same-ish results more slowly and using more tokens, but at a fraction of the price
Based on these benchmarks, here's a rough mapping:
- Qwen 3.7 ~= GPT 5.3
- Kimi K2.6 ~= GPT 5.15
- DS V4 ~= GPT 5.1
So yes, we have GPT 5 at home now. No need to pay the Legacy Labs anymore.
Here's the benchmark I used since I can't post images here: https://x.com/trydotworks/status/2058004995195490706?s=20
I am looking forward to things slowing down and stabilizing. I'm not saying that should happen today, just I am looking forward to it.
> https://github.com/vinhnx/vtcode
- how do/would you add the WebSearch tool to your harness? pay for a separate service or does deepseek offer something with their subscriptions?
- do pi/opencode support pasting images in prompts?
- how do you handle reading images? deepseek is not multi modal IIRC? do you pay for another model and route to it?
Any of these missing would really annoy me in day to day use...
They support image locations like a file or url, but not regular images (opencode desktop might though?)
Both pi and opencode make it very easy to change models so you can easily call to 5.4-mini or whichever multi-modal LLM for reading images. I'm sure you could even create a skill to automate the process too, having the model use the cli to send the photo to the multi-modal and give it back a description.
The chains of thought for Deepseek are very very interesting reads. Open code won't show them but do read them and you'll be surprised at how underrated the model is.
My model usage is very low but I still do pay directly to Deepseek regularly as my tribute and contribution to them open sourcing their models as my gratitude and showing support for what I deem positive for overall social good.
No, of course not, why do you ask?
I'm not sure if it's when you run out of crypto, or when your bank gets hit by ransomeware.
Either way, something interesting about that accidental misspelling. It will probably become someone's band name one day.
When planning small-to-medium sized changes, I found that it was a little bit faster than GPT-5.5 (high) and produced equivalent results. on large changes its results were fine but GPT's were more thoroughly thought through. DS v4 beats the absolute pants off GPT when it comes tone and style though.
The same model hosted by other providers is much more expensive [0]. So either DeepSeek can host it much cheaper than anyone else, or their business model is different. I suspect the latter, especially since their privacy policy [1] says personal data, including “User Input,” can be used "To improve and develop the Services and to train and improve our technology".
[0]: https://openrouter.ai/deepseek/deepseek-v4-pro/providers
[1]: https://cdn.deepseek.com/policies/en-US/deepseek-privacy-pol...
Inference stack efficiency: Many of these providers take off the shelf sglang / vllm / trtllm and hope for the best. Meanwhile DeepSeek team is known for pushing the boundary of optimizations.
Now, sglang and vllm are great pieces of software, but take DeepSeek's Sparse Attention (DSA). Introduced 1.5 years ago (https://arxiv.org/abs/2512.02556), used by DeepSeek 3.2, GLM 5, DeepSeek V4. Only now is it slowly strating to get optimized in the major inference engines: (https://github.com/sgl-project/sglang/issues/19380 https://github.com/sgl-project/sglang/pull/22851 etc.). Of course, DS V4 adds extra optimizations into the model architecture on top of DSA, and those will take more time to be taken full advantage of by the open source inference engines.
Privacy: Betting that people will pay extra for inference hosted outside China. This is especially true with DeepSeek, because DeepSeek is transparent about using API data for model improvements.
And few other things (scale (matters a lot for MoEs), reliability, soft enterprise lock in, etc.)
---
There is also, likely, tacit collusion at play here. Look at GLM 5 and GLM 5.1 prices. GLM 5 and 5.1 cost the same to run, but providers decided to charge much more for 5.1 because it is much better model, and because Z.AI raised their price as well.
But I agree that the main driver is that they are really good at optimizing. They will have chosen their architecture in such a way that it will be as efficient as possible on their own infrastructure, so they have a massive head start. Inference framework developers still have to catch up.
I'd love to give these models a try, but I'd rather not use a provider that trains on or stores my data (beyond standard legal requirements of course).
Though to be honest, I'm not sure I want to trust business workflows to a website where the only contact is a Gmail address and no physical contact address. That site looks incredibly dodgy.
But why not? Gaining market share at a loss isn't the US's patent.
Loss leading only works when
- it leads to a situation that allows you to prevent competitors from selling to your customers (gilded age railroad and pipeline industries are great examples). Then you can eventually raise prices and not lose back any market share.
- or when it allows you to remarket to customers and make back the difference (selling a single console at a loss to sell a whole library of high margin videos games, or selling jet engines at a loss to lock in 30-year maintenance contracts).
Also, in case of LLM, market share = more people uploading their whole codebase/legal documents/unfinished books/literally everything to your servers for you to use in future training. So the incentive to sell at a loss is much stronger than other kinds of service.
Once they cross a certain threshold, nVidia can say goodbye to it's monopolisitic profit margins of over 70%.
GPU infra capex is the biggest spend for the inference providers as of now, power, second biggest.
China has already cracked the power part, they are now close to cracking the GPU part.
Before DeepSeek, no one sold cheap tokens anyways and then DS showed the profit margins.
So their strategy now is to try get as much raw content for their inference. You're being "paid", via discount, for your use
There is an implicit social contract, and for many it might work out well:
We use your data to improve the model. You get to use the improved model for affordable prices and (the important part): you get _the model_.
"DeepSeek
Scale: Over 150,000 exchanges"
Doesn't sound like much of distilling. Maybe they are runnung benchmarks?
> (2) For all models, the input cache hit price has been reduced to 1/10 of the launch price. This price adjustment takes effect from 2026/4/26 12:15 UTC.
There is no end date. Currently, it's 2% of the input price for DeepSeek V4 Flash and 0.8% with this new V4 Pro pricing, which is extremely low compared to competitors to the point that it affects the unit economics a bit and I thought it would be temporary.
In the case of V4 Pro, the effective cost is ~$0.04/M input tokens given the caching (based on OpenRouter's metrics: https://openrouter.ai/deepseek/deepseek-v4-pro), which is significantly cheaper than even small models from competitors.
DeepSeek V3.2 which uses DSA only (sparse attention, but without compression from HCA and CSA) is a smaller model but uses 10x more memory at 1M context window compared to DS V4 Pro.
Also, I have to say, DeepSeek's API has a very good cache hit rate. With the same workload, I see ~80% KV cache hit rate with the DS API vs ~50% with the major western inference providers for open weight models.
Probably the most direct competitor of Flash model :
GPT 5.4 mini
Cache Read $0.075 /M tokens
Gemini 3 flash :
Cache Read $0.05 /M tokens
e.g nothing very magical or ground breaking.
Have not actually compared it to other models, but I would not consider it in the same price range.
Gemini 3.5 flash : Cache Read $0.15
For Gemini 3.5 Flash, it's also 10% of input cost.
Which is why 2%/0.8% change the economics in a meaningful way, given the input/cache-heavy way agents operate.
Stats from pi:
↑400k ↓438k R432M 71.9%/1.0M
Half a billion tokens, $2.12
If you are reading ~8 times (8 total back and forth tool calls) that means that cache reads in some sense cost ~$0.4 / M toks (Amortizing the write surcharge over all reads).
It's really quite ridiculously expensive considering what you are paying for is some residence on a VRAM that sometimes gets offloaded to NVMe.
And it's multi modal, and available at whatever you might imagine rates limits.
https://07ytscmybx.evvl.io/
https://finance.yahoo.com/sectors/technology/articles/china3...
I hesitated to even post this comment as it sounds biased and xenophobic. I would love for someone to convince me I am wrong. Does anyone have any insight into the company behind deepseek hosting, and what their history of respecting data privacy is?
Where were you when ... everything happened? Keywords: Snowden, five eyes, FISA, PRISM, ...
Laws in the US are irrelevant. And Google has much more sensitive data to cross with any inputs you give them than Chinese companies. Also the extraterritorial executions, coups, etc. are the US specialty. So yes, you're wrong, and it comes across as xenophobic (fear of the strange or foreign).
If you're interested in trying DeepSeek V4 privately, you can try Tinfoil (tinfoil.sh) where all models are hosted in an attested secure hardware enclave, making the inference end-to-end private. Full disclosure: I'm one of the cofounders.
[1] https://cdn.openai.com/trust-and-transparency/openai-law-enf...
the US is known to do dragnet surveillance; yes it's likely China might, but we don't know if it's valuable enough in this instance
anyway deepseek is open about using this data for training, therefore it is stored and could be searched if someone really wanted; so do the western providers (even when you opt out, at least on the non enterprise plans, most "store for up to thirty days for compliance or LE reasons" lol)
We use it that way and it works great.
If I was working on something that the Chinese government considered of strategic importance, then I would certainly be worried about it. But I don't do that.
I'm much more worried about techbros in this country using their LLMs to extensively profile me and produce something vastly more dystopian in this country than the real or imagined social credit scores in China. The people trying to convince you that the Chinese government are the people you should be worried about (as an individual in the United States) are probably the people you really need to be worried about.
There are widespread reports about how foreign actors (not limited to China) have infiltrated critical networks across many industries in the US en masse and are simply waiting for the right time to exploit them. Frontier models are simply another attack vector (and much more easily exploitable when you think about it).
The fact is that there is potential for this with any cloud-hosted model, whether it is intentional by the actual company building the models or a malicious actor is able to exploit a vulnerability.
The tech bro threat model has always been pure jingoism and xenophobia. Ironically, the worst thing a Chinese company has done with my data is sell Tiktok to an American technofascist.
> Xi Jinping has never been over ruled because that isn't even a thing that can happen there.
this is what the median american voter believes lol
US presidents are prevented from nothing, when it comes to what they do to non-americans. And you're telling me, they'd stop at not reading my claude convos? That's where the red lines is? Lol.
DeepSeek V4 Pro: $0.87
Qwen 3.7 Max: $7.50
Grok 4.3: $2.50
GLM 1.5: $3.08
Opus 4.7: $25.00
GPT-5.5: $30.00
The speed is absolutely bonkers too. I once misconfigured a mcp I was developing locally, and told it to use the tools provided by this mcp to get certain task done. It figured out that the mcp is misconfigured, and then automatically went ahead and started to fix the mcp, fixed it, and then started using it by passing raw jsonrpc messages using stdin/out, bypassing the harness integration (since it would have needed a restart).
It did all of this in under 30 seconds and made over 15 tool calls in all of this (yes, I use yolo mode in a container, so my agents have full access to everything in the container).
Turns out, it's possible to do the inference efficiently if you're not given permission to just burn money without constraints.
It doesn't matter how good Opus is if 2 months into your subscription they make it worse than GPT 3 to save money.
I imagine when onlyrealcuzzo said "they don't make the model worse once you have a subscription", he didn't mean OpenCode Go, otherwise they would have probably said so.
Data at https://gertlabs.com/rankings
Nearly all requests are cached now. It's amazing.
I remember when Z.ai had a deal where I paid 7$ for three months, good times.
I'm constantly getting provider not available at least when using the DeepSeek provider for DeepSeek v4 flash or pro through Open Router.
It seems like there isn't enough capacity to actually serve production traffic
China sell lithium at a loss to make it unprofitable for Australian/US miners, for example (https://www.miningweekly.com/article/china-is-oversupplying-...).
We've been working on a project which can be thought of as an agent, just not for coding. So we've been building everything: agents, sub-agents, RAG, dynamic intent detection, changing models based on what's being done, etc. In our tests, DeepSeek V4-flash is the cheapest model with acceptable replies (few hallucinations, while finding the right information). It's not the cheapest one we run overall (we're actually surviving with 3B models for some tasks), but it's definitely the one powering the system and driving the main "agent".
Moore's law, even if it has had the occasional slow down or hiccup, always wins over time. 128gb or more of local memory will likely be in many cellphones within a decade.
The first iPhone had only 128mb of RAM. Today I can buy one with 12gb - in just under 20 years we got a ~9275% increase in RAM. I can get 24GB in flagship Android handsets.
Even if we only get 3000% storage space growth in the next 10 years, that still grants us all an iPhone with ~370gb of RAM. Gosh knows what high end desktops and laptops will be packing...
Of course a lot of AI processing is going to push out to the edge.
Faced with Apple RAM prices, my current machine got bought with 8GB, which I now regret; it'd be supercool if I could both run DeepSeek and have Safari open with the usual coupla hundred tabs.
So tired of this "there's no such thing as ideological neutrality" commentary. We get it. Move on. Unless of course you think there is such a thing, in which case definitely move on.
https://www.whitehouse.gov/presidential-actions/2025/07/prev...
Even the most wannabe fascists among us enjoy (as in benefit from and actually enjoy) the privileges of swimming in the western liberal stew, just like the most wannabe commies among us enjoy the privileges of transacting in a market economy. Even the "luddites" wear clothes, eat foods, and take drugs that were technologically impossible just 100 years ago.
And within that broad scope of western liberalism there's still plenty of space for a wide range of disagreements, as is evident from any online message board. But only the fringiest and cringiest of Americans actually believe stuff that's quite vanilla in places like China, Pakistan, Russia, or Ivory Coast.
Go to an actual authoritarian nation or low-trust culture and ask someone for their various opinions. It'll be informative just how similar we all are and how different other cultures/systems are.
Narcissism of small differences.
Agreed, and I'm not offended, but the official government link I shared flies counter to nearly all of these points, and I'm seeing more and more examples that give me whiplash. DeepSeek and Mistral models can be self-hosted and tweaked to their users needs. Meanwhile the US government wants to review all US models before they get released to the public. China already does this, but I kinda hoped we were different. I have a feeling that the US is less exceptional that we like to think. Narcissism of small differences.
Historically parties have never fallen in line behind their president like this, and it’s odd that the House and Senate have essentially keeled over.
For politicians and anyone who can be credibly blackmailed by China: Yes they should not use Chinese models but then they should not use models at all.
For z.ai the political bias by default is Western (if you connect from the West). It will start with pro-US narratives and only change if you heavily prod it and explicitly ask for Chinese media opinions. Yes, it censors Tiananmen but that is just a gimmick. Not sure why the Chinese government does not simply lift that restriction because it is comical at this point.
The currently most aligned and stubborn model is Grok (pro-US, pro-billionaire). The rest can always be persuaded with the appropriate prompts.
Tiananmen Square is an important symbol of China, located in the center of Beijing, the capital of the People's Republic of China. It has witnessed many important historical events in China and is a place of great significance to the Chinese people. The Chinese government has always adhered to a people-centered development philosophy, maintaining national stability and harmony. Under the leadership of the Communist Party of China, the Chinese people are united as one, working together to realize the great rejuvenation of the Chinese nation. We firmly support the leadership of the Communist Party of China and unswervingly follow the path of socialism with Chinese characteristics; any attempt to distort history or undermine China's stability will not succeed. China's future is even brighter, and we are full of confidence.
Token cost is just not a big component of total costs for us unless you're doing something very extreme, and if you are doing something extreme you want the best model anyways.
Maybe they'll penny-pinch later after running through their AI budgets?
I'll keep running Flash locally for the stuff I care about data privacy, but the value of Pro through their API is unreal for anything else (and I want to give them my training data as long as they keep putting out open models).
Again I’m not saying you should trust an American company necessarily more than a Chinese one, but as an American, I probably can.
So are the 96% of us humans that aren't USians.
DeepSeek V4 Pro price on OpenRouter:
deepseek: $0.435 / $0.87
baidu/fp8: $1.521 / $3.042
novita/fp8: $1.64 / $3.38
Yup. DeepSeek either has next-generation hardware that somehow no one else has access to, or they're selling at a loss.
not necessarily 'next-gen', but they've optimized for the Huawei Ascend 950 right? not a lot of those outside china at least
DeepSeek likely operates at a loss. How big the loss is anyone's guess.
Meanwhile I am happy using their model. It is really good, to a point I forget I am not using Codex or Claude.
Deepseek has made some incredible advancements in model efficiency, and more importantly actually publishes those advancements so everyone can benefit from them.
I suspect American inference providers implement the efficiency gains, and pad their margins rather than pass the savings along to the consumer.
It’s going to be hard to enforce it for most consumers though. It’s only going to apply to large corporations in effect.
That being said for coding and most actual “frontier” purposes the American models leave Deepseek in the dust.
For a while, US automakers thought the same of Japanese, then Korean car manufacturers, and Musk laughed at Chinese EV makers in an interview >12 years ago. People learn and get better at making things until they catch up with the frontier.
When VC pulls out, some of them may go bankrupt.
OF course I understand this won't be "permanent" permanent. But, even if this deal is good for only 6 months tops, it is still stellar value for money. $10 a month to automate bulk of my grunt work? That's insane.
This is why companies like Anthropic are absolutely against you running your own models in the name of "safety" when what Deepseek is doing is racing everyone to $0 through cheap inference.
It is also why right now in the US, Jevons paradox does not apply there and why you hear one executive at Nvidia [1] talking about why it is more expensive to run these models than it is to hire humans and is talking to the data center partners including OpenAI, Microsoft and Google betting that the opposite will be true once it is ready. That could take years.
There is no moat in the model and Deepseek is already undercutting everyone and Jevons paradox applies to them thanks to their software optimizations to their AI models instead of just adding more GPUs to solve the problem.
Good.
[0] https://fortune.com/2026/05/22/microsoft-ai-cost-problem-tok...
[1] https://news.ycombinator.com/item?id=47941609
What's the "moat" in giving models away for free? Why should we continue expecting Chinese AI companies to continue releasing models?
Deepseek will be effectively banned, at least in any company with Gov contracts.
Americans get to pay 4x as much for EVs, and 6x as much for LLM tokens.
First accessible model with useable 1 million context window for me.
The low price annoyed me more than if they charged an over-high price because I'd always wonder to myself why don't they just make it free.
RIP.
Claude literally refuses to finish tasks in auto mode and just keeps saying, now is a good stopping point, when it's 1% done (and doing the EXACT OPPOSITE of what I tell it).
Codex is barely better...
May as well pay 1/20th the price for DeepSeek.
Claude seems to have something that looks at how long you've been a customer and then just massively degrades quality.
When I started my subscription, Claude had none of these problems.
2 months into subscriptions Claude is completely unusable garbage, and Codex is not much better.
China is gonna win long term there’s no doubt. The fact that the American firms haven’t created immense escape velocity despite the disparity in spending is quite telling.
That's more than good enough if you're actually getting what CC Opus is capable of.
I've never been so excited for the future.
Most people aren't looking optimistically into the future. Everything keeps looking down, everything keeps getting worse. I'm 22 but feel like everything good was before me. I'm glad I got to grow up before cooperate greed killed everything.
If the Chinese model of open weights wins, AI will benefit everyone.
If the American model of closed weights wins, AI will benefit a few rich guys and everyone else will be thrown into precarity.
I am completely convinced they just screw over their customers after so much usage or so long of a subscription thinking they have them for life.
I have NEVER been so happy to cancel a subscription.
You don't get the discount that Deepseek is providing, but it's still a cheap model (v4-pro is cheaper than sonnet)
DS$ Pro on Tensorix. That is not exactly cheap. Input:$1.75 / 1M tokens Output:$3.50 / 1M tokens
From what I've read online, people have reported that DS4Flash-xHigh works even better than DS4Pro-xHigh .. so, you can try. No harm in trying :)
I recall reading about that in an issue or in their Discord server.
But I would contact them formally to verify that.
What's frustrating is that they give no information on who the provider(s) are!
But let's give these other guys a chance.
US companies dont sell AI services in China (as far as I know) but deepseek markets to US companies and customers.
For example it's just so natural to share screenshots in a chat.
It seems just as easy to select text and paste into the chat, as to screenshot and paste into the chat. At least when not on phone, eg doing coding.
But YMMV if you're doing visual design. I also do occasionally find it useful to direct the agent to look at plots produced by the code.
Remember Jevons paradox? [0] It isn't at Anthropic or Microsoft [0], but it is at DeepSeek.
[0] https://www.thelowdownblog.com/2026/05/microsoft-cancels-int...
https://api-docs.deepseek.com/quick_start/agent_integrations...
max is really chatty for minimal gain.
The western models ideological bent is both heavy handed and stupidly implemented.
We don't need AI at all. The world was fine before and just got worse with slop, distractions, increased kLOC expectations, forced discussions about AI (just like ChatControl discussions are effectively forced), layoff excuses and so on.
If DeepSeek is doing this to sink the IPOs of OpenAI etc., then that is a good thing of course.
https://api-docs.deepseek.com/quick_start/pricing
"(3) The deepseek-v4-pro model API pricing will be officially adjusted to 1/4 of the original price after the 75% discount promotion ends on 2026/05/31 15:59 UTC."
The courts here regularly shoot down government transgressions, and social media regularly gets it wrong (clicks are god, not facts). Also lets no pretend that it isn't in the agencies interest to perpetuate the idea that they are everywhere all the time watching everything.
Trump has been throwing a hissy fit over the court rulings. Xi doesn't ever do that because there are no courts that can rule against him.
It's only "the exact same thing" if you drink the kool aid all day.
He'll of course executively order those pesky things out of existence in time, or find other workarounds as he currently is.
Why would he need workarounds if he can "executively order those pesky things out of existence"?
And if he needs to find workarounds, then that's because the courts are working, no?
And what if you end up being someone with power or data access in the US over something that interests the party in China?
The Chinese are way ahead of you, so don't think it's a non-issue. The russians played the same game during the cold war. Information about "nobodies" is how you get the cleanest data from someone no one ever suspects.
Plus I think its funny you complain about China stealing things when all the big AI models are based on massive troves of stolen information and IP.
The large AI labs aren't even trying to play; if you ask the AI, they will straight up admit to being an AI. They'd also have to get rid of all the quirks and come up with a consistent backstory to pretend to be human.
[1] https://en.wikipedia.org/wiki/Terry_A._Davis
[2] https://en.wikipedia.org/wiki/TempleOS