Hacker News | codekisser's comments

what place do vector-native databases have in 2025? I feel like using pgvector or RediSearch works well, and most setups will probably be using Postgres or Redis anyway.


Philip here from the Chroma engineering team.

Chroma supports multiple search methods - including vector, full-text, and regex search.

Four quick ways Chroma is different from pgvector: better indexes, sharding, scaling, and object storage.

Chroma uses SPANN (Scalable Approximate Nearest Neighbor) and SPFresh (a freshness-aware ANN index). These are specialized algorithms not present in pgvector [1].

The core issue with scaling vector database indexes is that they don't handle `WHERE` clauses efficiently the way SQL does. In SQL you can ask "select * from posts where organization_id=7" and the b-tree index gives good performance. But with vector indexes, as the index grows it not only gets slower, it also gets less accurate. Combining filtering with large indexes compounds both problems: poor performance and poor accuracy.
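A toy illustration of the post-filtering half of this problem (plain Python with made-up data, not any real database): an index returns the global top-k, and the `WHERE`-style filter then discards most of it, so you get back far fewer results than you asked for.

```python
import random

random.seed(0)

# Hypothetical corpus: 10,000 docs spread across 100 orgs.
docs = [{"id": i, "org": i % 100, "emb": [random.random() for _ in range(8)]}
        for i in range(10_000)]

def dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

query = [random.random() for _ in range(8)]

# Post-filter: take the global top-10 by distance, THEN apply org_id = 7.
global_top10 = sorted(docs, key=lambda d: dist(d["emb"], query))[:10]
survivors = [d for d in global_top10 if d["org"] == 7]

# Pre-filter: restrict to org 7 first, then take its top-10.
org7_top10 = sorted((d for d in docs if d["org"] == 7),
                    key=lambda d: dist(d["emb"], query))[:10]

# With 100 orgs, only about 1 in 100 of the global top-k belongs to org 7,
# so the post-filtered result set is usually nearly empty.
print(len(survivors), len(org7_top10))
```

A real ANN index makes this worse still, because its top-k is itself approximate; the sketch only shows the filtering arithmetic.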

The solution is to have many small indexes, which Chroma calls "Collections". Instead of keeping all user data in one table, you shard across collections, which improves both performance and accuracy.
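The idea can be sketched in a few lines of plain Python (a stand-in for the concept, not the real Chroma API): each tenant gets its own small index, so queries never need a filter and each index stays small.

```python
import random

random.seed(1)

class Collection:
    """One small brute-force index holding a single tenant's vectors."""
    def __init__(self):
        self.items = []  # list of (doc_id, embedding)

    def add(self, doc_id, emb):
        self.items.append((doc_id, emb))

    def query(self, q, k=5):
        scored = sorted(
            self.items,
            key=lambda it: sum((x - y) ** 2 for x, y in zip(it[1], q)))
        return [doc_id for doc_id, _ in scored[:k]]

collections = {}  # tenant name -> its own small index

def get_or_create_collection(tenant):
    return collections.setdefault(tenant, Collection())

# Shard writes by tenant instead of pouring everything into one big index.
for i in range(1000):
    tenant = f"org_{i % 10}"
    get_or_create_collection(tenant).add(i, [random.random() for _ in range(4)])

hits = get_or_create_collection("org_7").query([0.5, 0.5, 0.5, 0.5])
print(hits)  # only org_7's documents are ever candidates
```

The tenant name here (`org_7`) is illustrative; the point is that the query runs against a 100-item index instead of a 1,000-item one with a filter bolted on.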

The third issue with using SQL for vectors is that the vectors quickly become a scaling constraint for the database. Writes become slow due to consistency requirements, disk usage becomes dominated by vector indexes, and CPU gets clogged by constantly re-computing indexes. I've been there, and it ultimately hurts overall application performance for end users. Chroma Cloud's solution is a distributed system, which allows strong consistency, high write throughput, and low-latency reads.

Finally, Chroma is built on object storage - vectors are stored on AWS S3. This allows cold + warm storage tiers, so that you can have minimal storage costs for cold data. This "scale to zero" property is especially important for multi-tenant applications that need to retain data for inactive users.

[1] https://www.youtube.com/watch?v=1QdwYWd3S1g


I wonder how Chroma collections compare to Postgres partitioning. I haven't done this personally, but you should theoretically be able to add a `PARTITION BY LIST (collection_name)` to get the same effect as sharding between Chroma collections.


By object storage here, do you mean the recently released S3 Vectors?


We use S3, but not S3 Vector.


if you want or need to optimize for speed, cost, scalability or accuracy.

dedicated solutions have more advanced search features that enable more accurate results. search indexing is resource-intensive and can contend with postgres/redis for resources. the cost and speed benefits naturally become more pronounced as data volume scales.

for example, chroma has built-in regex+trigram search and copy-on-write forking of indexes. this feature combo is killer for the code-search use case.
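A sketch of the general trigram-indexing idea behind regex search (the classic codesearch technique, not necessarily Chroma's actual implementation): index every 3-gram of each document, then intersect posting lists to get candidate docs before running the expensive regex on just those candidates.

```python
from collections import defaultdict

def trigrams(s):
    """All 3-character substrings of s."""
    return {s[i:i + 3] for i in range(len(s) - 2)}

# Hypothetical corpus of code lines.
docs = {
    1: "def parse_config(path):",
    2: "class ConfigParser:",
    3: "print('hello world')",
}

# Inverted index: trigram -> set of doc ids containing it.
index = defaultdict(set)
for doc_id, text in docs.items():
    for tri in trigrams(text):
        index[tri].add(doc_id)

def candidates(literal):
    """Docs containing every trigram of the literal (a superset of true matches)."""
    tris = trigrams(literal)
    if not tris:
        return set(docs)
    result = None
    for tri in tris:
        posting = index.get(tri, set())
        result = posting if result is None else result & posting
    return result

print(candidates("Config"))  # only docs with all trigrams of "Config"
```

A real regex engine would first extract required literals from the pattern, use this index to shrink the candidate set, then run the full regex only on those documents.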


What other local databases did you consider, and why did you choose DuckDB? Google tells me a lot of other people stream CDC pipelines to DuckDB, but I'm not familiar enough with it to know what makes it such a compelling choice.


I wanted to start with DuckDB since it's a really incredibly powerful tool that people should try out. The performance you can get on analytical queries running on your local compute is just really impressive. And with Snowflake streams you can actually stream live data into it without changing anything about your existing data. On why not other databases: I wanted to focus on OLAP to start, since there are already great tools like dlt that help you load data from OLTP sources like Postgres and MySQL into OLAP, but OLAP-to-OLAP is pretty rare.

Have you run into a use case for streaming data between data warehouses yourself yet? If so which warehouses?


My university's dorms suffered from no AC, black mold, cockroaches, and yearly floods. I couldn't imagine if teachers and TAs had to live in such slums in addition to being given such inadequate pay. Providing housing is just treating a symptom - universities should fix this by paying their staff enough so they can even afford rent.



I think it's a pretty odd choice to make capital letters and ascenders tall in the name of legibility, especially for programming fonts. Much like bolding text, making capital letters relatively larger will make them stick out more in the code. But, do we need capital letters to be more noticeable?

* For ALL_CAPS constants, there's really no reason they should stick out above the rest of the code. In fact, as an invariant, a constant is probably less important when reading code.

* If you mix up PascalCase and camelCase, your linter or compiler will most likely catch it.

Maybe capital letters should blend in more, so that syntax highlighting and "code shape" (e.g. indentation) dominate; those are much more important for understanding code at a glance than how we arbitrarily chose to capitalize some names.


When something similar happened to the popular YouTube channel Linus Tech Tips, the scammers gained a few thousand

https://www.reddit.com/r/LinusTechTips/comments/11zhr9n/the_...


more like hundreds of thousands of dollars, and unlike with ransomware, the FBI and other law enforcement don't care, so it goes under the radar, especially when less famous channels are hacked.


The LTT hacker ran the classic fake Elon Musk crypto livestream that scammers have been rehashing for years. It probably says something about the kind of person who reveres Musk that they're apparently known to be marks who will easily fall for the wallet inspector routine.


So you're saying scammers target Musk adherents for the same reason Nigerian Prince emails are rife with spelling errors.


Yes, targeting people who are oblivious to the blindingly obvious is a fantastic strategy for scammers.


I develop AI girlfriends. I've struggled a lot with achieving natural-feeling recall. I've gone through a few iterations, but I'm currently experimenting with a knowledge graph/vector hybrid that uses an LLM to extract facts to build the graph. Both the performance and $ cost really hurt, but it truly does breathe life into AI characters. I've seen a lot of the commercial products using the latest and most expensive models in the cloud to extract facts. Instead, I fine-tuned a local model on a GPT-4-generated dataset, and it works surprisingly well. It will miss some connections, but in practice I don't think that's too noticeable.
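The shape of such a pipeline can be sketched in plain Python (a hedged illustration, not the commenter's actual system; the LLM fact-extractor is stubbed with canned output, since the real step would be a model call):

```python
from collections import defaultdict

def extract_facts(message):
    """Stub for the LLM extraction step: returns (subject, relation, object) triples."""
    # A real system would call a fine-tuned model here; this canned mapping
    # exists only to make the sketch runnable.
    canned = {
        "My sister Ana just adopted a cat named Miso.":
            [("Ana", "is_sister_of", "user"), ("Ana", "owns_cat", "Miso")],
    }
    return canned.get(message, [])

graph = defaultdict(list)  # subject -> [(relation, object), ...]

def remember(message):
    for subj, rel, obj in extract_facts(message):
        graph[subj].append((rel, obj))

def recall(entity):
    """Fetch everything connected to an entity, to splice into the prompt."""
    return graph.get(entity, [])

remember("My sister Ana just adopted a cat named Miso.")
print(recall("Ana"))
```

In the hybrid described, a vector index would sit alongside this graph to find *which* entities a new message is about before walking their edges; that retrieval step is omitted here.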


Do you find you really need that level of “resolution” with memories?

On our [1] chatbots we use one long memories text field per chatbot <-> user relationship.

Each bot response cycle suggests a new memory to add as part of its prompt (along with the message etc)

Then we take that new memory and the existing memories text and feed both to a separate "memory archivist" LLM prompt cycle that's tasked with adding the new memory and re-summarizing the whole thing, yielding a replacement for the stored memories with the new memory folded in.

Maybe overly simplistic, but it's easy to manage and pretty inexpensive. The archiving part is async and fast. The LLM seems pretty good at sussing out what's important and what isn't.
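The cycle described above can be sketched like this (a stand-in, not Graydient's code; the archivist LLM is replaced by a trivial merge function so the data flow is visible):

```python
stored_memories = ""  # one long memories text field per bot<->user pair

def summarize(existing, new_memory):
    """Stub archivist: merge and compress. Really an LLM re-summarization prompt."""
    merged = (existing + " " + new_memory).strip()
    # Crude stand-in for re-summarization: cap the field's length.
    return merged[-500:]

def bot_turn(user_message, suggested_memory):
    global stored_memories
    # 1. Respond to the user with stored_memories in the prompt (LLM call omitted).
    # 2. Asynchronously fold the bot's suggested memory into the archive,
    #    replacing the stored memories wholesale.
    stored_memories = summarize(stored_memories, suggested_memory)

bot_turn("I got a new job!", "User started a new job this week.")
bot_turn("It's at a bakery.", "User's new job is at a bakery.")
print(stored_memories)
```

The interesting property is that the archive is rewritten, not appended to, so the field stays bounded while old details either survive the re-summarization or get compressed away.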

[1] https://Graydient.ai


I have already tried what you're doing, and it didn't perform well enough for me. I've been developing this project for two years now. Its memory isn't going to fit in a single prompt.

I imagine that your AI chatbots aren't as cheap or performant as they can be with your potentially enormous prompts. Technical details aside, just like when talking to real people, it feels nice when they recall minor details you mentioned a long time ago.


If it's your personal assistant and has been helping you for months, that means it will pretty quickly start forgetting details and only have a vague view of you and your preferences. So instead of being your personal assistant, it practically clusters your personality and gives you generic help with no reliance on real historical data.


The law is about "digital forgeries" in general - trying to pass off any computer-generated fake as authentic - but they're marketing this as just the "deepfake porn law" for voter popularity.


I think nothing less than a watermark or caption will suffice, but this is probably a problem that will have to be answered in courts. I can't believe they would pass a law with such ambiguity. What do these people do all day?


If you read the actual law, it looks like this law pertains to "digital forgery" - computer-generated material that "falsely appears to be authentic". Does this mean that AI-generated porn is OK as long as it is explicitly labelled?

South Carolina is currently in a funny state where it's illegal to distribute non-consensual AI-generated porn of someone, but there aren't any laws against non-consensual revenge porn. Good news is that revenge porn laws have been adopted almost country-wide.


That is explicitly addressed in the bill text:

> regardless of whether the visual depiction indicates, through a label or some other form of information published with the visual depiction, that the visual depiction is not authentic

