
> Interesting, the pacing seemed very slow when conversing in English, but when I spoke to it in Spanish, it sounded much faster

So did you run the model offline on your own computer and get realtime audio?

Can you tell me the GPU or specifications you used?

I asked ChatGPT:

https://chatgpt.com/share/68d23c2c-2928-800b-bdde-040d8cb40b...

It seems it needs a GPU costing around $2,500; do you have one?

I tried Qwen online via its website interface a few months ago, and found it to be very good.

I've run some models offline, including DeepSeek-R1 70B on CPU (pretty slow; my server has 128 GB of RAM but no GPU), and I'm looking into what kind of setup I'd need to run an offline model on a GPU myself.
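For sizing, I've been using a back-of-the-envelope rule of thumb (my own estimate, not from any vendor docs): weights take roughly parameter count times bytes per parameter, plus some overhead for the KV cache and runtime. A quick sketch in Python:

    # Rough VRAM estimate for running an LLM locally.
    # Rule of thumb only; real usage also depends on context length,
    # KV cache size, and the inference runtime.

    def vram_gb(params_billions: float, bytes_per_param: float,
                overhead: float = 1.2) -> float:
        """Approximate VRAM in GB: weights * per-param size * ~20% overhead."""
        return params_billions * bytes_per_param * overhead

    for name, bpp in [("FP16", 2.0), ("8-bit", 1.0), ("4-bit", 0.5)]:
        print(f"70B @ {name}: ~{vram_gb(70, bpp):.0f} GB")

    # 70B @ FP16:  ~168 GB  -> far beyond any single consumer card
    # 70B @ 8-bit: ~84 GB
    # 70B @ 4-bit: ~42 GB   -> still two 24 GB GPUs, or CPU offload

By that estimate, even 4-bit quantization of a 70B model won't fit on one 24 GB card, which is why I've been running it on CPU.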



> So did you run the model offline on your own computer and get realtime audio?

At the top of the README of the GitHub repository, there are a few links to demos where you can try the model.

> It seems it needs around a $2,500 GPU

You can get a used RTX 3090 for about $700, and it has the same amount of VRAM (24 GB) as the RTX 4090 in your ChatGPT response.

But as far as I can tell, quantized inference implementations for this model do not exist yet.
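For reference, once a model does get quantization support, loading it in 4-bit via Hugging Face Transformers + bitsandbytes usually looks something like the sketch below. This is hypothetical: the model id is a placeholder, and it assumes a Transformers-compatible checkpoint, which, as noted, doesn't exist for this model yet.

    # Hypothetical sketch of a 4-bit quantized load with Transformers
    # + bitsandbytes. Placeholder model id; no quantized build of this
    # particular model exists at the time of writing.
    import torch
    from transformers import AutoModelForCausalLM, BitsAndBytesConfig

    quant_config = BitsAndBytesConfig(
        load_in_4bit=True,
        bnb_4bit_compute_dtype=torch.float16,  # 4-bit weights, FP16 compute
    )

    model = AutoModelForCausalLM.from_pretrained(
        "org/placeholder-model",          # placeholder, not a real checkpoint
        quantization_config=quant_config,
        device_map="auto",                # spread layers across GPU(s)/CPU
    )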



