Author Topic: what up with this deepseek?  (Read 3185 times)

mistymoney

  • Magnum Stache
  • ******
  • Posts: 3167
what up with this deepseek?
« on: January 27, 2025, 12:10:41 PM »
How is it so cheap? What are the effects on US markets after today?

What is DeepSeek after?

MustacheAndaHalf

  • Walrus Stache
  • *******
  • Posts: 7604
  • Location: U.S. expat
Re: what up with this deepseek?
« Reply #1 on: January 27, 2025, 02:19:26 PM »
Despite restrictions on Nvidia chips, Chinese company DeepSeek has caught up with and surpassed language models built on the latest Nvidia chips.  Existing AI models rely on compute-heavy pre-training and post-training, but apparently DeepSeek made use of "test-time scaling" to improve its models at less cost.  This calls into question the need for Nvidia's latest chips, and the investments made by big tech companies (since DeepSeek passed them by spending just $6 million to create their LLM).

It also suggests the latest AI technology might not be in the U.S., or exclusively in the U.S.  It also hints that big tech companies have wasted money building their LLMs, when a cheaper, better approach is possible.  I think Nvidia is taking a hit because their latest chips have been upstaged, and big tech is taking a hit for relying heavily on those chips.

https://www.cnbc.com/2025/01/27/nvidia-falls-10percent-in-premarket-trading-as-chinas-deepseek-triggers-global-tech-sell-off.html

FireLane

  • Handlebar Stache
  • *****
  • Posts: 1639
  • Age: 43
  • Location: NYC
Re: what up with this deepseek?
« Reply #2 on: January 27, 2025, 02:27:56 PM »
As I said (more snarkily) in another thread, this is an interesting test case of whether AI evangelists believe their own hype.

If you really believe that AI will go on learning and improving, making us exponentially more productive, culminating in a post-scarcity utopia where all work is done by machines, this is great news!

We can now obtain the same capabilities for much cheaper. That should make stocks go up, since it means there are even larger rewards for investing in companies that use this technology.

On the other hand, if you're a snake-oil salesman who's cynically using AI as the latest buzzword to draw in the dumb money, this is bad news. If good-enough AI models can be built this cheaply, there's no justification for sucking in billions of dollars of investor money to build bigger and bigger data centers.

If you believe in your heart of hearts that AI isn't going to change the world, but you've been saying otherwise to get people irrationally excited and to pump your stock prices... then the arrival of cheap competition is unwelcome news. It could be the pin that pops the AI bubble. In that case, you'd expect stocks to go down.

Which one did the market do?

walkingmiller

  • 5 O'Clock Shadow
  • *
  • Posts: 10
Re: what up with this deepseek?
« Reply #3 on: January 27, 2025, 03:33:14 PM »
If you are interested in a bit of a deep dive, Ben Thompson always has great insight - https://stratechery.com/2025/deepseek-faq/.

In essence, the market is reacting to innovations that many people already knew about if they were paying attention: you can make significant strides in AI performance without having to massively increase compute spending. Overall, this is probably good for AI consumers and people selling inference, but it potentially spells trouble for Nvidia's market cap.


Scandium

  • Magnum Stache
  • ******
  • Posts: 3134
  • Location: EastCoast
Re: what up with this deepseek?
« Reply #4 on: January 27, 2025, 03:39:04 PM »
What's the best way to sign up and test this without having my account hacked by the CCP? Do I need a totally separate email account with no connection to any of my others?

weebs

  • Bristles
  • ***
  • Posts: 259
  • Age: 51
  • Location: The Sticks
Re: what up with this deepseek?
« Reply #5 on: January 27, 2025, 04:58:01 PM »
What's the best way to sign up and test this without having my account hacked by the CCP? Do I need a totally separate email account with no connection to any of my others?

Hack someone else's account and use it.  ;-)

reeshau

  • Magnum Stache
  • ******
  • Posts: 3745
  • Location: Houston, TX Former locations: Detroit, Indianapolis, Dublin
  • FIRE'd Jan 2020
Re: what up with this deepseek?
« Reply #6 on: January 27, 2025, 05:29:47 PM »
It was called out today that the paper citing the $6M cost included a disclaimer that it didn't include "any prior investments."  So... $6M in electricity??

For example, they detailed that they used a cluster of 2,048 Nvidia H800 GPUs.  The cost of that wasn't disclosed, but if you assume a $12,000 unit cost (list is $22k), you are already at multiples of the $6M.  And you haven't even licensed your target content, or coded anything yet.
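Spelling out that back-of-envelope arithmetic (using the assumed $12k unit price, which is the post's guess, not a disclosed figure):

```python
gpus = 2048              # H800 cluster size reported in the paper
price_per_gpu = 12_000   # assumed unit cost; list price is ~$22k

hardware_cost = gpus * price_per_gpu
print(f"${hardware_cost:,}")  # $24,576,000 -- about 4x the $6M headline figure
```

So the hardware alone, at a discounted price, already dwarfs the number the market reacted to.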

Still, it's a great achievement, with a documented difference in approach.  As open source, other AI competitors will take a look at it and incorporate what works.

Fru-Gal

  • Handlebar Stache
  • *****
  • Posts: 2200
Re: what up with this deepseek?
« Reply #7 on: January 27, 2025, 05:34:48 PM »
I posted this in the Top is In thread but repeat it here:

Quote
I LOVE these AI oligarchs crying about DeepSeek. BTW you can run DeepSeek locally, UNLIKE fucking OpenAI’s shit. THIS is why they were all ass-kissing Trump: Regulatory capture. As Google said a few years ago, THERE IS NO MOAT. And now here we are. I’m loving it!!!

Also note that when you run DeepSeek locally (not in the cloud) you own your data!!!

The tech oligarchs are experiencing that FAFO sensation of having abused our trust for so many years that their Sinophobia isn’t working against DeepSeek!

Here’s the code to run it locally.
From https://github.com/deepseek-ai/DeepSeek-V3

Quote
We present DeepSeek-V3, a strong Mixture-of-Experts (MoE) language model with 671B total parameters with 37B activated for each token. To achieve efficient inference and cost-effective training, DeepSeek-V3 adopts Multi-head Latent Attention (MLA) and DeepSeekMoE architectures, which were thoroughly validated in DeepSeek-V2. Furthermore, DeepSeek-V3 pioneers an auxiliary-loss-free strategy for load balancing and sets a multi-token prediction training objective for stronger performance. We pre-train DeepSeek-V3 on 14.8 trillion diverse and high-quality tokens, followed by Supervised Fine-Tuning and Reinforcement Learning stages to fully harness its capabilities. Comprehensive evaluations reveal that DeepSeek-V3 outperforms other open-source models and achieves performance comparable to leading closed-source models. Despite its excellent performance, DeepSeek-V3 requires only 2.788M H800 GPU hours for its full training. In addition, its training process is remarkably stable. Throughout the entire training process, we did not experience any irrecoverable loss spikes or perform any rollbacks.
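The Mixture-of-Experts idea in that abstract (671B total parameters, only ~37B active per token) can be sketched with a toy top-k gating routine. Everything here is illustrative, not DeepSeek's actual code; the expert networks, sizes, and names are made up for the example.

```python
import math

def softmax(xs):
    m = max(xs)
    es = [math.exp(x - m) for x in xs]
    s = sum(es)
    return [e / s for e in es]

def moe_forward(token, experts, gate_weights, k=2):
    """Route one token to its top-k experts and mix their outputs.

    Only k experts run per token, so compute scales with k rather than
    with the total expert count -- the property that lets a 671B-parameter
    MoE activate only ~37B parameters per token.
    """
    # Gate: score each expert against the token, then softmax the scores.
    scores = [sum(w * x for w, x in zip(gw, token)) for gw in gate_weights]
    probs = softmax(scores)
    # Pick the k highest-scoring experts and renormalize their weights.
    topk = sorted(range(len(experts)), key=lambda i: probs[i], reverse=True)[:k]
    norm = sum(probs[i] for i in topk)
    out = [0.0] * len(token)
    for i in topk:
        y = experts[i](token)  # only these k experts actually execute
        for j, yj in enumerate(y):
            out[j] += (probs[i] / norm) * yj
    return out, topk
```

With, say, 8 experts and k=2, six of the eight expert networks never run for a given token, which is where the training and inference savings come from.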

Here’s the “no moat” memo. This is why Altman’s been fear-mongering to Congress.

From: https://semianalysis.com/2023/05/04/google-we-have-no-moat-and-neither/

[sorry about the formatting]

May 4, 2023
Google “We Have No Moat, And Neither Does OpenAI” Leaked Internal Google Document Claims Open Source AI Will Outcompete Google and OpenAI


Quote
We Have No Moat
And neither does OpenAI

We’ve done a lot of looking over our shoulders at OpenAI. Who will cross the next milestone? What will the next move be?

But the uncomfortable truth is, we aren’t positioned to win this arms race and neither is OpenAI. While we’ve been squabbling, a third faction has been quietly eating our lunch.
I’m talking, of course, about open source. Plainly put, they are lapping us. Things we consider “major open problems” are solved and in people’s hands today. Just to name a few:
LLMs on a Phone: People are running foundation models on a Pixel 6 at 5 tokens / sec.

Scalable Personal AI: You can finetune a personalized AI on your laptop in an evening.

Responsible Release: This one isn’t “solved” so much as “obviated”. There are entire websites full of art models with no restrictions whatsoever, and text is not far behind.

Multimodality: The current multimodal ScienceQA SOTA was trained in an hour.

While our models still hold a slight edge in terms of quality, the gap is closing astonishingly quickly. Open-source models are faster, more customizable, more private, and pound-for-pound more capable. They are doing things with $100 and 13B params that we struggle with at $10M and 540B. And they are doing so in weeks, not months. This has profound implications for us:
We have no secret sauce. Our best hope is to learn from and collaborate with what others are doing outside Google. We should prioritize enabling 3P integrations.

People will not pay for a restricted model when free, unrestricted alternatives are comparable in quality. We should consider where our value add really is.

Giant models are slowing us down. In the long run, the best models are the ones which can be iterated upon quickly. We should make small variants more than an afterthought, now that we know what is possible in the <20B parameter regime.


https://lmsys.org/blog/2023-03-30-vicuna/
What Happened
At the beginning of March the open source community got their hands on their first really capable foundation model, as Meta’s LLaMA was leaked to the public. It had no instruction or conversation tuning, and no RLHF. Nonetheless, the community immediately understood the significance of what they had been given.
A tremendous outpouring of innovation followed, with just days between major developments (see The Timeline for the full breakdown). Here we are, barely a month later, and there are variants with instruction tuning, quantization, quality improvements, human evals, multimodality, RLHF, etc. etc. many of which build on each other.
Most importantly, they have solved the scaling problem to the extent that anyone can tinker. Many of the new ideas are from ordinary people. The barrier to entry for training and experimentation has dropped from the total output of a major research organization to one person, an evening, and a beefy laptop.
Why We Could Have Seen It Coming
In many ways, this shouldn’t be a surprise to anyone. The current renaissance in open source LLMs comes hot on the heels of a renaissance in image generation. The similarities are not lost on the community, with many calling this the “Stable Diffusion moment” for LLMs.
In both cases, low-cost public involvement was enabled by a vastly cheaper mechanism for fine tuning called low rank adaptation, or LoRA, combined with a significant breakthrough in scale (latent diffusion for image synthesis, Chinchilla for LLMs). In both cases, access to a sufficiently high-quality model kicked off a flurry of ideas and iteration from individuals and institutions around the world. In both cases, this quickly outpaced the large players.
These contributions were pivotal in the image generation space, setting Stable Diffusion on a different path from Dall-E. Having an open model led to product integrations, marketplaces, user interfaces, and innovations that didn’t happen for Dall-E.
The effect was palpable: rapid domination in terms of cultural impact vs the OpenAI solution, which became increasingly irrelevant. Whether the same thing will happen for LLMs remains to be seen, but the broad structural elements are the same.
What We Missed
The innovations that powered open source’s recent successes directly solve problems we’re still struggling with. Paying more attention to their work could help us to avoid reinventing the wheel.
LoRA is an incredibly powerful technique we should probably be paying more attention to
LoRA works by representing model updates as low-rank factorizations, which reduces the size of the update matrices by a factor of up to several thousand. This allows model fine-tuning at a fraction of the cost and time. Being able to personalize a language model in a few hours on consumer hardware is a big deal, particularly for aspirations that involve incorporating new and diverse knowledge in near real-time. The fact that this technology exists is underexploited inside Google, even though it directly impacts some of our most ambitious projects.
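A toy sketch of the low-rank idea the memo is describing (sizes and helper names are illustrative, not Google's or Microsoft's actual LoRA code):

```python
import random

def matmul(A, B):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def lora_delta(d, k, r):
    """Build a rank-r update delta_W = B @ A for a d x k weight matrix.

    A full update stores d * k numbers; the LoRA factors store only
    r * (d + k).  With r << min(d, k), that is the "factor of up to
    several thousand" reduction the memo describes.
    """
    B = [[random.gauss(0.0, 0.01) for _ in range(r)] for _ in range(d)]  # d x r
    A = [[random.gauss(0.0, 0.01) for _ in range(k)] for _ in range(r)]  # r x k
    return matmul(B, A)  # d x k, but rank at most r

# One 4096 x 4096 attention matrix: ~16.8M numbers to fine-tune in full,
# versus 8 * (4096 + 4096) = 65,536 for a rank-8 LoRA update.
full_params = 4096 * 4096
lora_params = 8 * (4096 + 4096)
print(full_params // lora_params)  # 256x fewer trainable parameters
```

Only the small factors B and A are trained and shipped; the frozen base weights never change, which is why a consumer GPU is enough.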
Retraining models from scratch is the hard path
Part of what makes LoRA so effective is that – like other forms of fine-tuning – it’s stackable. Improvements like instruction tuning can be applied and then leveraged as other contributors add on dialogue, or reasoning, or tool use. While the individual fine tunings are low rank, their sum need not be, allowing full-rank updates to the model to accumulate over time.
This means that as new and better datasets and tasks become available, the model can be cheaply kept up to date, without ever having to pay the cost of a full run.
By contrast, training giant models from scratch not only throws away the pretraining, but also any iterative improvements that have been made on top. In the open source world, it doesn’t take long before these improvements dominate, making a full retrain extremely costly.
We should be thoughtful about whether each new application or idea really needs a whole new model. If we really do have major architectural improvements that preclude directly reusing model weights, then we should invest in more aggressive forms of distillation that allow us to retain as much of the previous generation’s capabilities as possible.
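The "stackable" claim above ("while the individual fine tunings are low rank, their sum need not be") can be checked on a tiny made-up case: two rank-1 updates whose sum has full rank.

```python
def matmul(A, B):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def add(A, B):
    """Elementwise sum of two same-shaped matrices."""
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(A, B)]

# Two independent rank-1 updates (each a column vector times a row vector).
update1 = matmul([[1.0], [0.0]], [[1.0, 0.0]])  # touches row 0 only
update2 = matmul([[0.0], [1.0]], [[0.0, 1.0]])  # touches row 1 only
stacked = add(update1, update2)                 # [[1, 0], [0, 1]]

# Nonzero determinant => rank 2: two rank-1 updates accumulated to full rank.
det = stacked[0][0] * stacked[1][1] - stacked[0][1] * stacked[1][0]
print(det)  # 1.0
```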
Large models aren’t more capable in the long run if we can iterate faster on small models
LoRA updates are very cheap to produce (~$100) for the most popular model sizes. This means that almost anyone with an idea can generate one and distribute it. Training times under a day are the norm. At that pace, it doesn’t take long before the cumulative effect of all of these fine-tunings overcomes starting off at a size disadvantage. Indeed, in terms of engineer-hours, the pace of improvement from these models vastly outstrips what we can do with our largest variants, and the best are already largely indistinguishable from ChatGPT. Focusing on maintaining some of the largest models on the planet actually puts us at a disadvantage.
Data quality scales better than data size
Many of these projects are saving time by training on small, highly curated datasets. This suggests there is some flexibility in data scaling laws. The existence of such datasets follows from the line of thinking in Data Doesn’t Do What You Think, and they are rapidly becoming the standard way to do training outside Google. These datasets are built using synthetic methods (e.g. filtering the best responses from an existing model) and scavenging from other projects, neither of which is dominant at Google. Fortunately, these high quality datasets are open source, so they are free to use.
Directly Competing With Open Source Is a Losing Proposition
This recent progress has direct, immediate implications for our business strategy. Who would pay for a Google product with usage restrictions if there is a free, high quality alternative without them?
And we should not expect to be able to catch up. The modern internet runs on open source for a reason. Open source has some significant advantages that we cannot replicate.
We need them more than they need us
Keeping our technology secret was always a tenuous proposition. Google researchers are leaving for other companies on a regular cadence, so we can assume they know everything we know, and will continue to for as long as that pipeline is open.
But holding on to a competitive advantage in technology becomes even harder now that cutting edge research in LLMs is affordable. Research institutions all over the world are building on each other’s work, exploring the solution space in a breadth-first way that far outstrips our own capacity. We can try to hold tightly to our secrets while outside innovation dilutes their value, or we can try to learn from each other.

Individuals are not constrained by licenses to the same degree as corporations
Much of this innovation is happening on top of the leaked model weights from Meta. While this will inevitably change as truly open models get better, the point is that they don’t have to wait. The legal cover afforded by “personal use” and the impracticality of prosecuting individuals means that individuals are getting access to these technologies while they are hot.
Being your own customer means you understand the use case
Browsing through the models that people are creating in the image generation space, there is a vast outpouring of creativity, from anime generators to HDR landscapes. These models are used and created by people who are deeply immersed in their particular subgenre, lending a depth of knowledge and empathy we cannot hope to match.
Owning the Ecosystem: Letting Open Source Work for Us
Paradoxically, the one clear winner in all of this is Meta. Because the leaked model was theirs, they have effectively garnered an entire planet’s worth of free labor. Since most open source innovation is happening on top of their architecture, there is nothing stopping them from directly incorporating it into their products.
The value of owning the ecosystem cannot be overstated. Google itself has successfully used this paradigm in its open source offerings, like Chrome and Android. By owning the platform where innovation happens, Google cements itself as a thought leader and direction-setter, earning the ability to shape the narrative on ideas that are larger than itself.
The more tightly we control our models, the more attractive we make open alternatives. Google and OpenAI have both gravitated defensively toward release patterns that allow them to retain tight control over how their models are used. But this control is a fiction. Anyone seeking to use LLMs for unsanctioned purposes can simply take their pick of the freely available models.
Google should establish itself a leader in the open source community, taking the lead by cooperating with, rather than ignoring, the broader conversation. This probably means taking some uncomfortable steps, like publishing the model weights for small ULM variants. This necessarily means relinquishing some control over our models. But this compromise is inevitable. We cannot hope to both drive innovation and control it.
Epilogue: What about OpenAI?
All this talk of open source can feel unfair given OpenAI’s current closed policy. Why do we have to share, if they won’t? But the fact of the matter is, we are already sharing everything with them in the form of the steady flow of poached senior researchers. Until we stem that tide, secrecy is a moot point.
And in the end, OpenAI doesn’t matter. They are making the same mistakes we are in their posture relative to open source, and their ability to maintain an edge is necessarily in question. Open source alternatives can and will eventually eclipse them unless they change their stance. In this respect, at least, we can make the first move.

The Timeline
Feb 24, 2023 – LLaMA is Launched
Meta launches LLaMA, open sourcing the code, but not the weights. At this point, LLaMA is not instruction or conversation tuned. Like many current models, it is a relatively small model (available at 7B, 13B, 33B, and 65B parameters) that has been trained for a relatively large amount of time, and is therefore quite capable relative to its size.

March 3, 2023 – The Inevitable Happens
Within a week, LLaMA is leaked to the public. The impact on the community cannot be overstated. Existing licenses prevent it from being used for commercial purposes, but suddenly anyone is able to experiment. From this point forward, innovations come hard and fast.

March 12, 2023 – Language models on a Toaster
A little over a week later, Artem Andreenko gets the model working on a Raspberry Pi. At this point the model runs too slowly to be practical because the weights must be paged in and out of memory. Nonetheless, this sets the stage for an onslaught of minification efforts.

March 13, 2023 – Fine Tuning on a Laptop
The next day, Stanford releases Alpaca, which adds instruction tuning to LLaMA. More important than the actual weights, however, was Eric Wang’s alpaca-lora repo, which used low rank fine-tuning to do this training “within hours on a single RTX 4090”.

Suddenly, anyone could fine-tune the model to do anything, kicking off a race to the bottom on low-budget fine-tuning projects. Papers proudly describe their total spend of a few hundred dollars. What’s more, the low rank updates can be distributed easily and separately from the original weights, making them independent of the original license from Meta. Anyone can share and apply them.

March 18, 2023 – Now It’s Fast
Georgi Gerganov uses 4 bit quantization to run LLaMA on a MacBook CPU. It is the first “no GPU” solution that is fast enough to be practical.
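A toy sketch of the symmetric 4-bit quantization trick (the idea only; Gerganov's actual ggml/llama.cpp formats use per-block scales and packed storage):

```python
def quantize_4bit(weights):
    """Map floats to integers in [-7, 7] plus one shared scale.

    Storing ~4 bits per weight instead of 32 is what shrinks a model
    enough to fit in laptop RAM and run on a CPU.
    """
    scale = max(abs(w) for w in weights) / 7 or 1.0
    q = [round(w / scale) for w in weights]
    return q, scale

def dequantize(q, scale):
    """Recover approximate float weights from the integers and scale."""
    return [v * scale for v in q]

weights = [0.42, -1.3, 0.07, 0.9]
q, scale = quantize_4bit(weights)
restored = dequantize(q, scale)
# Each value comes back within half a quantization step of the original.
```

The accuracy cost is small because each weight is off by at most scale/2, while the memory saving is roughly 8x versus 32-bit floats.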

March 19, 2023 – A 13B model achieves “parity” with Bard
The next day, a cross-university collaboration releases Vicuna, and uses GPT-4-powered eval to provide qualitative comparisons of model outputs. While the evaluation method is suspect, the model is materially better than earlier variants. Training Cost: $300.
Notably, they were able to use data from ChatGPT while circumventing restrictions on its API – They simply sampled examples of “impressive” ChatGPT dialogue posted on sites like ShareGPT.

March 25, 2023 – Choose Your Own Model
Nomic creates GPT4All, which is both a model and, more importantly, an ecosystem. For the first time, we see models (including Vicuna) being gathered together in one place. Training Cost: $100.

March 28, 2023 – Open Source GPT-3
Cerebras (not to be confused with our own Cerebra) trains the GPT-3 architecture using the optimal compute schedule implied by Chinchilla, and the optimal scaling implied by μ-parameterization. This outperforms existing GPT-3 clones by a wide margin, and represents the first confirmed use of μ-parameterization “in the wild”. These models are trained from scratch, meaning the community is no longer dependent on LLaMA.

March 28, 2023 – Multimodal Training in One Hour
Using a novel Parameter Efficient Fine Tuning (PEFT) technique, LLaMA-Adapter introduces instruction tuning and multimodality in one hour of training. Impressively, they do so with just 1.2M learnable parameters. The model achieves a new SOTA on multimodal ScienceQA.

April 3, 2023 – Real Humans Can’t Tell the Difference Between a 13B Open Model and ChatGPT
Berkeley launches Koala, a dialogue model trained entirely using freely available data.
They take the crucial step of measuring real human preferences between their model and ChatGPT. While ChatGPT still holds a slight edge, more than 50% of the time users either prefer Koala or have no preference. Training Cost: $100.

April 15, 2023 – Open Source RLHF at ChatGPT Levels
Open Assistant launches a model and, more importantly, a dataset for Alignment via RLHF. Their model is close (48.3% vs. 51.7%) to ChatGPT in terms of human preference. In addition to LLaMA, they show that this dataset can be applied to Pythia-12B, giving people the option to use a fully open stack to run the model. Moreover, because the dataset is publicly available, it takes RLHF from unachievable to cheap and easy for small experimenters.

« Last Edit: January 27, 2025, 05:46:10 PM by Fru-Gal »

Fru-Gal

  • Handlebar Stache
  • *****
  • Posts: 2200
Re: what up with this deepseek?
« Reply #8 on: January 27, 2025, 05:47:28 PM »
Quote
other AI competitors will take a look at it and incorporate what works.

People are running this on their phones!!! Buhbye ChatGPT!

Herbert Derp

  • Handlebar Stache
  • *****
  • Posts: 1390
  • Age: 34
Re: what up with this deepseek?
« Reply #9 on: January 27, 2025, 06:54:45 PM »
Overall, DeepSeek is a huge breakthrough for AI and a big win for the entire AI ecosystem. Progress and AI adoption will only accelerate from here, thanks to reduced costs and increased efficiency.

Also, the no moat thing has been known for years, that memo was from the first half of 2023!

The way I see it, the big losers in all of this are OpenAI, Sam Altman, and their investors. Also Nvidia to a lesser extent, but I don’t see why this advancement would stop demand for their chips. Large organizations may want fewer chips now, but many small organizations who previously were not able to train their own models may now decide to buy them. Most big technology companies who use AI will only benefit from this.

I am mostly happy that single entities such as OpenAI won’t be able to monopolize AI in the future. Indeed, there is no moat to protect them. I don’t think anyone should be able to monopolize that kind of power.

The anti-AI crowd shouldn’t be cheering about OpenAI’s loss, though. This advancement by DeepSeek only accelerates the progress of AI and makes it even more unstoppable. The anti-AI folks should be quaking in their boots right now, because the technology that is about to replace them just got a whole lot cheaper!
« Last Edit: January 27, 2025, 08:04:06 PM by Herbert Derp »

mistymoney

  • Magnum Stache
  • ******
  • Posts: 3167
Re: what up with this deepseek?
« Reply #10 on: January 27, 2025, 07:01:48 PM »
I posted this in the Top is In thread but repeat it here:

Quote
I LOVE these AI oligarchs crying about DeepSeek. BTW you can run DeepSeek locally, UNLIKE fucking OpenAI’s shit. THIS is why they were all ass-kissing Trump: Regulatory capture. As Google said a few years ago, THERE IS NO MOAT. And now here we are. I’m loving it!!!

Also note that when you run DeepSeek locally (not in the cloud) you own your data!!!


So - it's completely free, and secure. How?

And Why?

Where is the profit?

Herbert Derp

  • Handlebar Stache
  • *****
  • Posts: 1390
  • Age: 34
Re: what up with this deepseek?
« Reply #11 on: January 27, 2025, 07:08:43 PM »
So - it's completely free, and secure. How?

And Why?

Where is the profit?

It’s not free, it still costs money to buy the hardware and train / run the models. And you need to hire someone who knows how to do all of it.

It’s secure in that, since it is open source, the source code can be read by anyone and verified not to contain malware, backdoors, etc.

The people who created DeepSeek are involved with a hedge fund. Presumably, they plan to use AI to make a lot of money. The smarter that AI becomes, the more money their hedge fund will make. So they want to do whatever they can to advance the state of AI.

Also, a lot of people involved with AI don’t care about money. They care more about AI and what it means for the future of humanity. In many ways, AI is more important than trivial financial interests.

Read more here about what motivates the DeepSeek people:
https://www.chinatalk.media/p/deepseek-from-hedge-fund-to-frontier
« Last Edit: January 27, 2025, 07:18:41 PM by Herbert Derp »

Herbert Derp

  • Handlebar Stache
  • *****
  • Posts: 1390
  • Age: 34
Re: what up with this deepseek?
« Reply #12 on: January 27, 2025, 07:17:18 PM »
We can now obtain the same capabilities for much cheaper. That should make stocks go up, since it means there are even larger rewards for investing in companies that use this technology.

On the other hand, if you're a snake-oil salesman who's cynically using AI as the latest buzzword to draw in the dumb money, this is bad news. If good-enough AI models can be built this cheaply, there's no justification for sucking in billions of dollars of investor money to build bigger and bigger data centers.

If you believe in your heart of hearts that AI isn't going to change the world, but you've been saying otherwise to get people irrationally excited and to pump your stock prices... then the arrival of cheap competition is unwelcome news. It could be the pin that pops the AI bubble. In that case, you'd expect stocks to go down.

Which one did the market do?

The market is overreacting. Overall, this will benefit companies who create products that leverage AI. Almost everyone apart from OpenAI is helped by this, from startups to big tech. This is not the pin that pops the AI bubble, the AI boom will only accelerate because of this breakthrough.

Fru-Gal

  • Handlebar Stache
  • *****
  • Posts: 2200
Re: what up with this deepseek?
« Reply #13 on: January 27, 2025, 07:56:50 PM »
I posted this in the Top is In thread but repeat it here:

Quote
I LOVE these AI oligarchs crying about DeepSeek. BTW you can run DeepSeek locally, UNLIKE fucking OpenAI’s shit. THIS is why they were all ass-kissing Trump: Regulatory capture. As Google said a few years ago, THERE IS NO MOAT. And now here we are. I’m loving it!!!

Also note that when you run DeepSeek locally (not in the cloud) you own your data!!!


So - it's completely free, and secure. How?

And Why?

Where is the profit?

Open source software is often (but not always) free, as in this case. There are many free AI tools at this point (Google’s Gemini is also free), and with DeepSeek, at least two major free and open-source models that you can run on your own hardware. The way the software industry works is that much of the intellectual property is open source (think the LAMP stack, or Kubernetes, which enabled things like ride-sharing services), but then companies charge for services built around that IP.

What OpenAI has been doing is corrupt and unethical. They’ve been telling the world that their model is capable of causing human extinction, so only they should be trusted to develop it. They’ve lobbied the US government for regulation to ensure their first-mover advantage and secure DoD contracts (they are already working with weapons companies). There are too many more corrupt things about OpenAI to go into here. Suffice to say these guys are AFRAID of free market capitalism. And now Chinese researchers just blew their whole expensive data center and proprietary chip model out of the water. Some analysts are saying this is a Sputnik-type milestone.

All of this is GOOD. HTML doesn’t belong to a corporation and neither should AI.

You don’t need to be an expert to use these AI tools. ChatGPT already is used by tons of students, content creators, writers, researchers, etc.

I’m not predicting anything about the AI bubble (if it’s a bubble) or not. I’m just rejoicing in seeing these corrupt CEOs getting a taste of their own medicine. A fairer economy with less collusion is better for all of us. AI that runs on less energy and is democratically distributed is better for us all as well.
« Last Edit: January 27, 2025, 11:45:16 PM by Fru-Gal »

Herbert Derp

  • Handlebar Stache
  • *****
  • Posts: 1390
  • Age: 34
Re: what up with this deepseek?
« Reply #14 on: January 27, 2025, 08:17:14 PM »
I’m just rejoicing in seeing these corrupt CEOs getting a taste of their own medicine.

I think you’re right about OpenAI. Glad to see that those guys won’t be able to monopolize AI. But what other CEOs besides Sam Altman are you referring to?

I guess Microsoft isn’t so happy about this because of their heavy investment into OpenAI. But they will benefit greatly from the increased efficiencies. Other AGI companies like Google and Anthropic can’t be super happy either, but none of them seem to have aspirations to dominate AI development like OpenAI.

Nvidia should be happy as long as they can sell chips, and more AI means more chips, so they should be perfectly fine with this. In fact, Nvidia recently released their own open source AI model (NVLM) specifically so that more companies would build AI and buy more chips. And Meta is already on the open source train.

Honestly, I can’t really name any companies who are hurt by this apart from OpenAI.
« Last Edit: January 27, 2025, 08:26:29 PM by Herbert Derp »

Fru-Gal

  • Handlebar Stache
  • *****
  • Posts: 2200
Re: what up with this deepseek?
« Reply #15 on: January 27, 2025, 08:28:40 PM »
I’m mainly referring to all the CEOs who’ve created walled gardens for social media that use our data for advertising, AI training and propaganda, but then tell the public that AI is too important for us to have access to. And to all the CEOs who are lining up for Stargate meal tickets, taxpayer funds and tax cuts and will then boast about their intellectual prowess in creating market-leading companies with record profits and obscene executive pay packages.

Nvidia is hurt by this, at least temporarily, because they’ve put a lot of effort into proprietary AI hardware.

Herbert Derp

  • Handlebar Stache
  • *****
  • Posts: 1390
  • Age: 34
Re: what up with this deepseek?
« Reply #16 on: January 27, 2025, 08:44:40 PM »
I’m mainly referring to all the CEOs who’ve created walled gardens for social media that use our data for advertising, AI training and propaganda, but then tell the public that AI is too important for us to have access to.

I’m not sure that this is a 100% accurate take. The number one company, by far, that is involved in using AI for social media, advertising, and propaganda is Meta, but they have embraced open source AI from the start. That leaves Google, but they have been lagging far behind, and I haven’t heard regulatory-capture sentiments from them either. The only other AI company involved in this stuff that comes to mind is ByteDance, but they have stayed completely silent in US AI regulatory politics, for obvious reasons.

And to all the CEOs who are lining up for Stargate meal tickets

These guys are happy that they don’t have to invest $500B anymore!

Nvidia is hurt by this, at least temporarily, because they’ve put a lot of effort into proprietary AI hardware.

I still think more AI means more chips, and Nvidia will make out quite fine. Lots of smaller companies will want to buy chips now that they can afford to do what only larger companies used to be able to do.
« Last Edit: January 27, 2025, 08:58:58 PM by Herbert Derp »

Fru-Gal

  • Handlebar Stache
  • *****
  • Posts: 2200
Re: what up with this deepseek?
« Reply #17 on: January 27, 2025, 08:48:05 PM »
Yes, I’m happy that Meta’s AI LLama is open sourced. I’m happy to see these companies compete with each other and force each other to be better and do better.

That said Meta has done a lot of bad as well. There were so many simple things* that could have been done over the last 15 years of social media to moderate it but instead the walled garden CEOs all raced to see who could make the most addictive product. As a result we saw a lot of mental health consequences.

*daily time limits, age limits, propaganda limits, etc.

I agree about Nvidia, but it is really interesting to see them quietly surge to the top and then, in one fell swoop, see the premise for that surge face a major challenge.
« Last Edit: January 27, 2025, 08:52:36 PM by Fru-Gal »

Herbert Derp

  • Handlebar Stache
  • *****
  • Posts: 1390
  • Age: 34
Re: what up with this deepseek?
« Reply #18 on: January 27, 2025, 09:39:04 PM »
It is now being reported that DeepSeek has 50,000 Nvidia H100s, so this breakthrough may not be all that it is hyped up to be:
https://wccftech.com/chinese-ai-lab-deepseek-has-50000-nvidia-h100-ai-gpus-says-ai-ceo/

Fru-Gal

  • Handlebar Stache
  • *****
  • Posts: 2200
Re: what up with this deepseek?
« Reply #19 on: January 27, 2025, 10:24:14 PM »
It is now being reported that DeepSeek has 50,000 Nvidia H100s, so this breakthrough may not be all that it is hyped up to be:
https://wccftech.com/chinese-ai-lab-deepseek-has-50000-nvidia-h100-ai-gpus-says-ai-ceo/

Oh wow, all this drama is fascinating. Nvidia sure gave a non-answer to the question of whether or not chip export controls were circumvented. I can think of two reasons for that.

1. If they did export fully functioning chips illegally, of course they don’t want to admit it. But if they didn’t break the law, why not just say so and denounce anyone claiming those are their H100s?
2. If they neither confirm nor deny, it casts doubt on DeepSeek’s claims, which makes a better case for Nvidia investors.

https://www.nextplatform.com/2025/01/27/how-did-deepseek-train-its-ai-model-on-a-lot-less-and-crippled-hardware/

However, that doesn’t change the fact that the model can be run and trained locally.

Also, to answer an earlier question from @mistymoney: I was wrong, it’s not free to use on the DeepSeek platform. There, like most AI tools, you pay per query. But it still puts a dent in ChatGPT’s $200/month Pro plan. Unlike ChatGPT, it is open source and free if you download and install it on your own hardware.
« Last Edit: January 27, 2025, 10:35:41 PM by Fru-Gal »

MustacheAndaHalf

  • Walrus Stache
  • *******
  • Posts: 7604
  • Location: U.S. expat
Re: what up with this deepseek?
« Reply #20 on: January 28, 2025, 06:51:58 AM »
It is now being reported that DeepSeek has 50,000 Nvidia H100s, so this breakthrough may not be all that it is hyped up to be:
https://wccftech.com/chinese-ai-lab-deepseek-has-50000-nvidia-h100-ai-gpus-says-ai-ceo/

An earlier link shed light on this:

Quote
Scale AI CEO Alexandr Wang said they have 50,000 H100s.

I don’t know where Wang got his information; I’m guessing he’s referring to this November 2024 tweet from Dylan Patel, which says that DeepSeek had “over 50k Hopper GPUs”. H800s, however, are Hopper GPUs, they just have much more constrained memory bandwidth than H100s because of U.S. sanctions.

Here’s the thing: a huge number of the innovations I explained above are about overcoming the lack of memory bandwidth implied in using H800s instead of H100s. Moreover, if you actually did the math on the previous question, you would realize that DeepSeek actually had an excess of computing; that’s because DeepSeek actually programmed 20 of the 132 processing units on each H800 specifically to manage cross-chip communications. This is actually impossible to do in CUDA. DeepSeek engineers had to drop down to PTX, a low-level instruction set for Nvidia GPUs that is basically like assembly language. This is an insane level of optimization that only makes sense if you are using H800s.

Meanwhile, DeepSeek also makes their models available for inference: that requires a whole bunch of GPUs above-and-beyond whatever was used for training.
https://stratechery.com/2025/deepseek-faq/#:~:text=AI%20CEO

FINate

  • Magnum Stache
  • ******
  • Posts: 3402
Re: what up with this deepseek?
« Reply #21 on: January 28, 2025, 07:49:34 AM »
Commoditization.

This story is primarily about commoditization, whereas the tech underneath is just a means to this end.

When this happens it changes who captures the value of technical innovation. If initial reports are accurate, DeepSeek has produced a product with 90% of the value of competitors for 1/10 the cost. I think we need to wait and see more confirmation of this, but if true this is a massive earthquake that has fundamentally shifted the investment landscape.

This will impact VCs/investors the most. It's going to be very tough to get a decent ROI on billions spent building data centers filled with the most advanced GPUs if there's an open source alternative undercutting prices. This is classic Innovator's Dilemma stuff.

Nvidia is also hurt by this, though less so than AI investors. Yes, they will still sell a ton of chips. But if good-enough models don't require the latest greatest chips then most of what they sell will be older, less profitable chips. These are less profitable because there are other competitors in this segment which means everyone is competing on price -- again, commoditization.

This also has the potential to be very bad for Silicon Valley. The valley runs on cycles of hype waves: VCs/investors pump money into startups, which then IPO or consolidate via M&A, which dumps money back to investors to fund the next cycle. A ton of SV money is currently sunk into AI companies -- if much of that value has been destroyed seemingly overnight (though, as others have pointed out, this has been underway for quite some time for those paying attention), that means a lot less money for future ventures.


reeshau

  • Magnum Stache
  • ******
  • Posts: 3745
  • Location: Houston, TX Former locations: Detroit, Indianapolis, Dublin
  • FIRE'd Jan 2020
Re: what up with this deepseek?
« Reply #22 on: January 28, 2025, 11:03:23 AM »
I agree it's about commoditization, but I don't think it's about the hardware level at all.  DeepSeek used what they had--they never said they shunned the current Nvidia hardware.  They were cut off from it.

If we were in a place where every internet function had AI embedded in it, I might agree that hardware would be topped out (and datacenters with it). But a reduction in cost greatly increases the potential uses of generative AI: more people can come into the pool. The savings come from needing less scale, not from the older chips being just as good -- or cheaper to run -- as the newer ones. (Maybe they are, but DeepSeek doesn't prove that. Since their engine is open source, maybe someone will test that out.)

For AI engines, this is a wake-up call, but it's also hardly the end.  Again, as DeepSeek is open source, they can incorporate the better bits in their code, or use it as inspiration for their own research.

To me, this is like the rise of ARM. Handheld devices weren't on the same power curve as laptops; "I don't care how powerful your processor is, it's crap if I have to recharge my phone after 2 hours!" So, after decades of chasing pure compute speed, companies switched to also incorporating power management. ARM is still here, and formidable, but has hardly eaten everyone's lunch. (Intel's self-inflicted wounds aside.)

Defining a new dimension of measurement is definitely good for users of the technology, and a milestone in its lifecycle.

walkingmiller

  • 5 O'Clock Shadow
  • *
  • Posts: 10
Re: what up with this deepseek?
« Reply #23 on: January 28, 2025, 04:24:02 PM »
Quote
I agree it's about commoditization, but I don't think it's about the hardware level at all.  DeepSeek used what they had--they never said they shunned the current Nvidia hardware. They were cut off from it.

I completely agree. Nvidia is going to continue selling a ridiculous number of chips. The issue is that the markets had priced in an Nvidia monopoly for chips. Nvidia famously has extortionary margins on their chips, and they have been able to sustain these margins because of the perceived need for the best hardware. DeepSeek's research paper and model performance cuts into Nvidia's (and analysts') narrative that hardware innovations will drive AI innovation. If hardware is not the primary driver for innovation, Nvidia will have to shrink its margins because AI companies will shift capital allocation away from hardware.

While DeepSeek's research and models are incredibly impressive, I think this narrative that hardware will drive innovation has been false for a while. We have seen this same pattern happening time and again over the last few years - OpenAI/Google/Anthropic/Meta release frontier models that cost eye-popping sums of money to train. Then, 4 or 5 months later, a second-tier company releases a similar model that costs a fraction of what the first-tier companies spent. The news cycle just skips the news from these second-tier companies because the leading AI companies are so good at keeping the hype going with new releases. If OpenAI's o3 had been more widely accessible and impressive, I don't think this story would have garnered nearly the attention it has. 

So, while this news is a big deal, I think it has more to do with the markets right-sizing in light of information that has been known for quite a while.

rocketpj

  • Handlebar Stache
  • *****
  • Posts: 1230
Re: what up with this deepseek?
« Reply #24 on: January 28, 2025, 05:17:21 PM »
I don't time investments, but I do recall thinking that last week would probably be a good time to sell and hold cash for a few months while the Trump chaos demons did their work.  Of course I did not do so, because what do I know really? (protip- not much).  Had I done so it probably would have been a good trade, ironically for the wrong reasons.

Meh.  Time to look away from my ETF portfolio for awhile.  Probably time to dump a bit extra in, if I find myself with a bit extra...

ChpBstrd

  • Walrus Stache
  • *******
  • Posts: 8170
  • Location: A poor and backward Southern state known as minimum wage country
Re: what up with this deepseek?
« Reply #25 on: January 29, 2025, 06:36:00 AM »
I don't understand the glee about deepseek being open source.

How many people are running Linux right now, on the desktop or phone they are using to read this?
How many people use something other than Google, Siri, or Alexa for search?
How many people are on Mastodon for social media?

The internet is inherently organized around brands, not traditional moats. We all happily use products which psychologically manipulate us to our detriment, which spy on us - even listening to our ambient conversations to identify what we're talking about and geolocating us, which give "sponsored" answers to questions and searches, and which overall deliver a mediocre experience because it's the default of what everyone else is doing. Convenience and the illusion of being "free" seem to be all that matters.

WRT the fantasy of running open source AI on your own device I have to ask how many petabytes of data will you be storing on your home server farm? Who will sell you these data? How much will the processors cost (H800s start at about a grand apiece) and how many hundreds of them will you have? How will you handle the five-figure monthly electric bill to keep all this running? Will your server farm be in an outbuilding, and is the nitrogen fire suppressant system up to code?
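For scale, here's a rough sketch of just the electric bill for a farm like that. All the numbers are illustrative assumptions, not quotes: roughly 700 W of draw per accelerator under load, $0.12/kWh, and ~730 hours in a month.

```python
# Rough monthly electricity cost for a small GPU farm.
# Assumptions (illustrative only): 700 W per accelerator under load, $0.12/kWh.

WATTS_PER_GPU = 700
PRICE_PER_KWH = 0.12
HOURS_PER_MONTH = 730  # ~24 * 365 / 12

def monthly_power_cost(num_gpus: int) -> float:
    """Dollars per month to power `num_gpus` accelerators at the assumed rates."""
    kwh = num_gpus * WATTS_PER_GPU / 1000 * HOURS_PER_MONTH
    return kwh * PRICE_PER_KWH

for n in (10, 100, 1000):
    print(f"{n:>5} GPUs: ${monthly_power_cost(n):>9,.0f}/month")
```

A few hundred accelerators already puts you into five figures a month, before you count cooling, networking, or the hardware itself.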

One might appeal to the old Moore's Law here and say just give it time and our server farms will become a laptop, but then why are we all still searching with Google over 20 years later, instead of running our own open source tools? Why are >99% of us defaulting to MacOS or Windows, which is increasingly adware, rather than wiping our new computers and installing Linux? Hell, that last move is 100% free and requires minimal technical skill, and yet we won't do it. And this is not even addressing the observation made years ago that Moore's law is breaking down, and will continue to flatline absent some advance like quantum computing. A thousand H800s may never fit into a laptop, for the same reason our airliners don't fly Mach 5 in the year 2025 and for the same reason atomic car engines never worked out. They'll eventually hit an engineering limit.

Even open source tools will eventually be packaged in a convenient app with a brand and ad campaign, preinstalled on new devices, and all the other monopolistic tricks that governments allow in the internet economy. MacOS is, after all, derived from Unix, which tax dollars paid to create. At this point, and after so many consumers have had the opportunity to embrace open source and failed to do so, I cannot imagine something as complex as an AI being the thing we decide to try first.

GuitarStv

  • Senior Mustachian
  • ********
  • Posts: 25364
  • Age: 43
  • Location: Toronto, Ontario, Canada
Re: what up with this deepseek?
« Reply #26 on: January 29, 2025, 07:56:50 AM »
WRT the fantasy of running open source AI on your own device I have to ask how many petabytes of data will you be storing on your home server farm? Who will sell you these data? How much will the processors cost (H800s start at about a grand apiece) and how many hundreds of them will you have? How will you handle the five-figure monthly electric bill to keep all this running? Will your server farm be in an outbuilding, and is the nitrogen fire suppressant system up to code?

A big part of the excitement about deepseek is that it's a cut down AI that doesn't require massive processors, a server farm, or gigantic electric bill.  You can get a decent version running on an OK desktop, or even a high spec laptop.

MustacheAndaHalf

  • Walrus Stache
  • *******
  • Posts: 7604
  • Location: U.S. expat
Re: what up with this deepseek?
« Reply #27 on: January 29, 2025, 08:06:38 AM »
I think ChpBstrd was referring to building one's own models, which requires hundreds of (Nvidia) H800 AI processors.  Searching, I see prices from $22k to $40k for each H800.  Which means a thousand (far less than DeepSeek used) would cost $22M to $40M for the hardware, and still leave you with far fewer processors than DeepSeek used to build their models (they have iterated multiple times, with R1 being the latest).

Is there some intermediate option that runs on normal laptops, somewhere between creating a new model on H800 processors and simply installing the DeepSeek app to send queries to their servers?
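Partially answering my own question: DeepSeek also published smaller "distilled" versions of R1 (roughly 1.5B to 70B parameters), and for running those locally the binding constraint is mostly memory. A back-of-envelope sketch (the parameter counts are the published distill sizes plus the full model; the quantization levels are just typical values, not tied to any particular tool):

```python
# Back-of-envelope: memory needed just to hold a model's weights.
# Real usage adds overhead (KV cache, activations), so treat these as floors.

def weight_memory_gib(params_billions: float, bits_per_param: int) -> float:
    """Approximate GiB required for the weights alone."""
    total_bytes = params_billions * 1e9 * bits_per_param / 8
    return total_bytes / 2**30

# Distilled R1 sizes vs. the full 671B model, at fp16 and 4-bit quantization.
for params in (7, 32, 70, 671):
    for bits in (16, 4):
        print(f"{params:>4}B @ {bits:>2}-bit: ~{weight_memory_gib(params, bits):,.0f} GiB")
```

So a 4-bit 7B distill needs only ~3 GiB and fits on an ordinary laptop, a 70B distill wants a serious workstation, and the full 671B model is firmly server territory, which lines up with what GuitarStv said above.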

MustacheAndaHalf

  • Walrus Stache
  • *******
  • Posts: 7604
  • Location: U.S. expat
Re: what up with this deepseek?
« Reply #28 on: January 29, 2025, 08:19:17 AM »
One might appeal to the old Moore's Law here and say just give it time and our server farms will become a laptop, but then why are we all still searching with Google over 20 years later, instead of running our own open source tools?
Crawling the internet uses I/O bandwidth more than CPU cycles, so it isn't purely Moore's Law.  There's also a separate problem of an arms race.  Customers want quality search results, but abusive advertisers want to get their product inserted anywhere they can.  If you search for Taylor Swift concerts, and an abusive advertiser manages to slip their beauty cream into the results - they get paid for those bad results.  But the customer suffers.  Google has to combat these "black-hat SEO" tactics to keep the quality of its results.


Why are >99% of us defaulting to MacOS or Windows, which is increasingly adware, rather than wiping our new computers and installing Linux? Hell, that last move is 100% free and requires minimal technical skill, and yet we won't do it.
TurboTax Premier (which is on sale right now) is only supported on Windows and MacOS.  It helps me enter K-1 form data correctly, where I lack expertise and confidence.  But it also keeps me trapped on those two operating systems.

I think there's also a divide between people who work with computers and everyone else. People who have assembled one or more computers tend to be people who work with computers for a living. To them, switching to Linux presents a lower barrier, even given concerns over what could go wrong, since problem solving is often part of the job. I can see how most people would rather just buy a computer or laptop and not worry about anything further.

FINate

  • Magnum Stache
  • ******
  • Posts: 3402
Re: what up with this deepseek?
« Reply #29 on: January 29, 2025, 08:40:47 AM »
I don't understand the glee about deepseek being open source.

How many people are running Linux right now, on the desktop or phone they are using to read this?
How many people use something other than Google, Siri, or Alexa for search?
How many people are on Mastodon for social media?

Interesting that you bring up Linux because this is actually a really good example of why people are excited about DeepSeek being open source.

Yes, Linux isn't super common on the desktop (though more than you may realize!). Yet this does not mean that Linux is not all over the place hiding in plain sight.

It's estimated that 70-90% of all internet serving infrastructure runs Linux. Big tech, small tech, and everything in between is running on some flavor of Linux. This is significant because it has greatly decreased the cost of starting a service, and it has provided great flexibility in terms of licensing and modification. The early days of the internet were less dynamic in part due to vendors charging an arm and a leg for servers while engaging in revenue grabs such as forced obsolescence.

Android is based on Linux. The open source nature of it means they were able to start with Linux as a base and modify and evolve it as needed. So in fact a ton of phones are running a variation of Linux.

Chromebooks, which are growing in popularity, are also Linux based.

Many electronic devices also use Linux. E.g. I use a lot of UniFi networking equipment, which is also running Linux.

The excitement of OSS isn't necessarily that everyone's going to download, build, and run it on their own. But rather, it creates an open and collaborative environment where all different types of people and businesses can keep building on the work of others. This does not necessarily mean "free" products, but instead a rich ecosystem of smaller companies building their own models (vs. winner takes all monopolies).

Open source is like a frontier of commoditization that forces corporations to keep adding new value.
 
« Last Edit: January 29, 2025, 08:42:22 AM by FINate »

ChpBstrd

  • Walrus Stache
  • *******
  • Posts: 8170
  • Location: A poor and backward Southern state known as minimum wage country
Re: what up with this deepseek?
« Reply #30 on: January 29, 2025, 09:03:07 AM »
I don't understand the glee about deepseek being open source.

How many people are running Linux right now, on the desktop or phone they are using to read this?
How many people use something other than Google, Siri, or Alexa for search?
How many people are on Mastodon for social media?

Interesting that you bring up Linux because this is actually a really good example of why people are excited about DeepSeek being open source.

Yes, Linux isn't super common on the desktop (though more than you may realize!). Yet this does not mean that Linux is not all over the place hiding in plain sight.

It's estimated that 70-90% of all internet serving infrastructure runs Linux. Big tech, small tech, and everything in between is running on some flavor of Linux. This is significant because it has greatly decreased the cost of starting a service, and it has provided great flexibility in terms of licensing and modification. The early days of the internet were less dynamic in part due to vendors charging an arm and a leg for servers while engaging in revenue grabs such as forced obsolescence.

Android is based on Linux. The open source nature of it means they were able to start with Linux as a base and modify and evolve it as needed. So in fact a ton of phones are running a variation of Linux.

Chromebooks, which are growing in popularity, are also Linux based.

Many electronic devices also use Linux. E.g. I use a lot of UniFi networking equipment, which is also running Linux.

The excitement of OSS isn't necessarily that everyone's going to download, build, and run it on their own. But rather, it creates an open and collaborative environment where all different types of people and businesses can keep building on the work of others. This does not necessarily mean "free" products, but instead a rich ecosystem of smaller companies building their own models (vs. winner takes all monopolies).

Open source is like a frontier of commoditization that forces corporations to keep adding new value.
Maybe you're right on an ecosystem innovation level. But in the end we will all end up paying some company that has packaged up something formerly OSS into a convenient product - like Android, Apple's OS, or the various web services (hosting, social media, search, etc.) running Linux on the backend. These can absolutely become monopolies or duopolies and extort a king's ransom from the population. Due to network effects, we can find ourselves locked into whatever everyone else is using, and that seems to be an inherent feature of the internet.

walkingmiller

  • 5 O'Clock Shadow
  • *
  • Posts: 10
Re: what up with this deepseek?
« Reply #31 on: January 29, 2025, 09:05:04 AM »
Quote
I don't understand the glee about deepseek being open source.

The excitement isn't really about the availability of the model itself, but about how they built it and the fact that they published their work. This particular model won't be used by many people in a year, but the process they used to train it so efficiently is a major advancement in the world of AI. The major AI companies may not adopt DeepSeek's methodology, because it's probably easier for them to throw money at compute than to do the work of optimizing training. But many tech companies that don't have unlimited capital will build on what DeepSeek has done, or will run DeepSeek on their own GPUs at a fraction of what OpenAI charges for its reasoning models.

reeshau

  • Magnum Stache
  • ******
  • Posts: 3745
  • Location: Houston, TX Former locations: Detroit, Indianapolis, Dublin
  • FIRE'd Jan 2020
Re: what up with this deepseek?
« Reply #32 on: January 29, 2025, 11:09:11 AM »
I don't understand the glee about deepseek being open source.

How many people are running Linux right now, on the desktop or phone they are using to read this?

Android is open source.  (Although the major manufacturers wrap their customizations around it)  There were 3 billion active devices, as of Dec. 2024.

Fru-Gal

  • Handlebar Stache
  • *****
  • Posts: 2200
Re: what up with this deepseek?
« Reply #33 on: January 29, 2025, 11:33:02 AM »
I don't understand the glee about deepseek being open source.

How many people are running Linux right now, on the desktop or phone they are using to read this?

Android is open source.  (Although the major manufacturers wrap their customizations around it)  There were 3 billion active devices, as of Dec. 2024.

The Internet runs on open source (the LAMP stack): Linux, Apache, MySQL and PHP/Perl/Python (I’m simplifying here).

The realtime services we have all become accustomed to in the last 10 years (Netflix, Uber) run on container orchestration platforms such as Kubernetes, the open source “operating system” for orchestrating app containers, which is how scaling occurs. The reason you need something like Kubernetes is that container orchestration happens at a scale well beyond human capacity to manage by hand.

Quote
Open source is like a frontier of commoditization that forces corporations to keep adding new value.

Love how you put this!
« Last Edit: January 29, 2025, 11:35:52 AM by Fru-Gal »

Fru-Gal

  • Handlebar Stache
  • *****
  • Posts: 2200
Re: what up with this deepseek?
« Reply #34 on: January 29, 2025, 12:28:53 PM »


AuspiciousEight

  • Bristles
  • ***
  • Posts: 365
Re: what up with this deepseek?
« Reply #35 on: January 29, 2025, 01:13:39 PM »
I asked chatgpt about deepseek and it gave me a long and pretty detailed explanation about the differences in technology, pros and cons, and various use cases where each one excels over the other one and when you would want to use chatgpt and when you would want to use deepseek, etc.

It didn't seem to show any signs of jealousy or have any particular insecurities about deepseek at all, lol :-p

ChpBstrd

  • Walrus Stache
  • *******
  • Posts: 8170
  • Location: A poor and backward Southern state known as minimum wage country
Re: what up with this deepseek?
« Reply #36 on: January 29, 2025, 02:07:55 PM »
I asked chatgpt about deepseek and it gave me a long and pretty detailed explanation about the differences in technology, pros and cons, and various use cases where each one excels over the other one and when you would want to use chatgpt and when you would want to use deepseek, etc.

It didn't seem to show any signs of jealousy or have any particular insecurities about deepseek at all, lol :-p
Reminds me of the videos where the AI reveals suicidal tendencies.

Fru-Gal

  • Handlebar Stache
  • *****
  • Posts: 2200
Re: what up with this deepseek?
« Reply #37 on: January 29, 2025, 03:21:40 PM »
I can’t get over what a weird timeline we’re living in. Trump comes to power with this absolute bizarre coalition of Christians, seditionists, oilmen, grifters, AI oligarchs, etc. They all stand around, congratulating themselves for securing access to major DOD contracts, and ensuring their next level of insane pay package.

Then China comes out of left field and proves necessity is the mother of invention by training a powerful model on crippled chips, running it much closer to the metal and circumventing the proprietary architecture Nvidia had put in place with CUDA.

Now, all of the AI CEO’s are freaking the fuck out, but don’t you understand you stupid lazy fucks that you’re in a coalition with fossil fuel? There is no way you can compete if we destroy all of our renewable efforts and neglect to build nuclear. Every single thing that you’ve done in the last five years has been leading to this point of weakness, and the Chinese were just art of warring, waiting for you to fuck up like you did.
« Last Edit: January 29, 2025, 03:23:38 PM by Fru-Gal »

reeshau

  • Magnum Stache
  • ******
  • Posts: 3745
  • Location: Houston, TX Former locations: Detroit, Indianapolis, Dublin
  • FIRE'd Jan 2020
Re: what up with this deepseek?
« Reply #38 on: January 29, 2025, 05:36:03 PM »
I can’t get over what a weird timeline we’re living in. Trump comes to power with this absolute bizarre coalition of Christians, seditionists, oilmen, grifters, AI oligarchs, etc. They all stand around, congratulating themselves for securing access to major DOD contracts, and ensuring their next level of insane pay package.

Then China comes out of left field and proves necessity is the mother of invention by training a powerful model on crippled chips, running it much closer to the metal and circumventing the proprietary architecture Nvidia had put in place with CUDA.

Now, all of the AI CEO’s are freaking the fuck out, but don’t you understand you stupid lazy fucks that you’re in a coalition with fossil fuel? There is no way you can compete if we destroy all of our renewable efforts and neglect to build nuclear. Every single thing that you’ve done in the last five years has been leading to this point of weakness, and the Chinese were just art of warring, waiting for you to fuck up like you did.

Well, at least it's a stupid conspiracy.  They never show that in the movies.

MustacheAndaHalf

  • Walrus Stache
  • *******
  • Posts: 7604
  • Location: U.S. expat
Re: what up with this deepseek?
« Reply #39 on: January 30, 2025, 07:23:04 AM »
Now, all of the AI CEO’s are freaking the fuck out, but don’t you understand you stupid lazy fucks that you’re in a coalition with fossil fuel? There is no way you can compete if we destroy all of our renewable efforts and neglect to build nuclear. Every single thing that you’ve done in the last five years has been leading to this point of weakness, and the Chinese were just art of warring, waiting for you to fuck up like you did.

These articles from last month suggest you might be misunderstanding the priorities of big tech.

"Google and Amazon invest in small modular reactors to power data centers"
https://spectrum.ieee.org/nuclear-powered-data-center

"Why tech giants such as Microsoft, Amazon, Google and Meta are betting big on nuclear power"
https://www.cnbc.com/2024/12/28/why-microsoft-amazon-google-and-meta-are-betting-on-nuclear-power.html

"Big tech is turning to old reactors (and planning new ones) to power the energy-hungry data centers that artificial intelligence systems need"
https://thebulletin.org/2024/12/ai-goes-nuclear/

GuitarStv

  • Senior Mustachian
  • ********
  • Posts: 25364
  • Age: 43
  • Location: Toronto, Ontario, Canada
Re: what up with this deepseek?
« Reply #40 on: January 30, 2025, 07:45:31 AM »

Agreed.  Big Tech is all about finding more ways to maximize human use/waste of energy.

ChpBstrd

  • Walrus Stache
  • *******
  • Posts: 8170
  • Location: A poor and backward Southern state known as minimum wage country
Re: what up with this deepseek?
« Reply #41 on: January 30, 2025, 07:46:21 AM »
Nuclear is approximately the most expensive method of generating electricity. If the US is going this route, and China is investing heavily in manufactured-at-home solar (in addition to their substantial hydro capabilities), then it seems like China will end up with the cheaper electricity.

Thus, if this supposed struggle for AI industrial leadership turns into a matter of where data centers can operate most cheaply, as it is with crypto-mining, then the place with the cheapest juice will have an advantage. Seen in this light, the US pivot toward expensive energy sources and China's pivot toward cheaper energy sources have strategic ramifications.

My $18k investment in pre-tariff solar panels is looking like a wise decision. If we in the U.S. are seriously thinking about an industrial policy tilted toward higher-cost electrical supply growth to support 24/7 running of thousands of new data centers, then rates will have to go up to pay for it all (no, I do not think the billionaires will pay for it). I generate 60% of my annual demand right now, and could relatively easily add panels to expand on that. Y'all take the expensive route and I'll take the Chinese approach.
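The solar-panel reasoning above can be sanity-checked with some back-of-the-envelope arithmetic. Only the $18k system cost comes from the post; the annual generation and utility rate below are illustrative assumptions, not the poster's actual numbers:

```python
# Rough simple-payback sketch for a residential solar install.
# Only the $18k cost is from the post; output and rate are assumed.
system_cost = 18_000          # USD, pre-tariff panel investment (from the post)
annual_output_kwh = 12_000    # assumed annual generation, kWh
utility_rate = 0.15           # assumed avoided cost, USD per kWh

annual_savings = annual_output_kwh * utility_rate
payback_years = system_cost / annual_savings
print(f"annual savings: ${annual_savings:,.0f}, simple payback: {payback_years:.1f} years")
```

Under these assumed figures the system pays for itself in about a decade, before accounting for rate increases, which would shorten the payback if rates rise as the post predicts.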

MustacheAndaHalf

  • Walrus Stache
  • *******
  • Posts: 7604
  • Location: U.S. expat
Re: what up with this deepseek?
« Reply #42 on: January 30, 2025, 08:00:25 AM »

Google and Microsoft have electricity costs that are about 1% of their revenue.  For FY2023, that was about 4 days of revenue.
https://www.bestbrokers.com/stock-brokers/big-techs-staggering-power-consumption-calculating-the-massive-electricity-bills-companies-pay-off-with-ease/
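The "about 4 days of revenue" figure follows directly from the 1% share. A minimal arithmetic check (the 1% share is the only input, taken from the post above; no company-specific figures are assumed):

```python
# If electricity costs are ~1% of annual revenue, express that cost
# as days' worth of revenue: 1% of a 365-day year is ~3.7 days.
cost_share = 0.01                      # electricity cost as a fraction of revenue
days_of_revenue = cost_share * 365     # equivalent days of revenue
print(f"{days_of_revenue:.1f} days of revenue")
```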

maizefolk

  • Walrus Stache
  • *******
  • Posts: 7547
Re: what up with this deepseek?
« Reply #43 on: January 30, 2025, 08:26:40 AM »
It's not clear that AI (at least chatbot-like AI) has any of the same network effects we think about with social networks or marketplaces, so I don't think we should be expecting natural monopolies.

Look at how much adoption we're already seeing of Facebook's open-source Llama models across a wide range of applications (both self-hosted and cloud-hosted by many different providers), and Facebook's open-source models are clearly not nearly as capable as DeepSeek's.
