Replies: 1 comment
Actually, on reading further, it seems that high-throughput delete operations are not recommended, so I don't think this would work for us. 😢
Hi,
I’m currently evaluating a migration of our OpenSearch cluster, which contains approximately 1 billion documents distributed across 40,000 tenants. The primary motivation for this migration is cost optimization.
At the moment, OpenSearch keeps all data live at all times. Since each tenant has a distinct access pattern, I believe an architecture that allows us to load or activate tenant data only when needed could significantly reduce costs for our users. Additionally, our usage is heavily concentrated between 9 a.m. and 5 p.m., so the ability to scale compute resources based on demand would further improve cost efficiency.
Based on my understanding of Quickwit's architecture, it seems that both of these goals may be achievable. However, I would appreciate expert suggestions on the best way to use Quickwit in our case:
A) Single index for all tenants
B) One index per tenant
C) An index pool model
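To make the three options concrete, here is a minimal sketch of how a tenant ID might map to an index ID under each model. Everything in it is hypothetical: the function names, the `tenants` prefix, and the pool size of 64 are placeholders, and option C is assumed to mean hashing tenants into a fixed number of shared indexes. It makes no Quickwit API calls; it only illustrates the routing and the number of indexes each option implies.

```python
import hashlib

POOL_SIZE = 64  # hypothetical number of shared indexes in the pool


def single_index(tenant_id: str) -> str:
    """Option A: all tenants share one index; queries filter on a tenant field."""
    return "tenants"


def index_per_tenant(tenant_id: str) -> str:
    """Option B: each tenant gets its own index."""
    return f"tenants-{tenant_id}"


def pooled_index(tenant_id: str, pool_size: int = POOL_SIZE) -> str:
    """Option C (assumed meaning): hash the tenant into one of pool_size indexes."""
    # A stable hash keeps a tenant on the same index across processes and restarts.
    digest = hashlib.sha256(tenant_id.encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:8], "big") % pool_size
    return f"tenants-pool-{bucket:03d}"


def index_count(num_tenants: int, pool_size: int = POOL_SIZE) -> dict:
    """Number of indexes that would need to be managed under each option."""
    return {"A": 1, "B": num_tenants, "C": min(num_tenants, pool_size)}
```

With 40,000 tenants and this placeholder pool size, the options would mean managing 1, 40,000, or 64 indexes respectively, with roughly 625 tenants per pooled index on average.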
In my local tests, option B performed better, but I only tested with 100 tenants, and I wonder what happens at 40k tenants. OpenSearch, for instance, cannot handle one index per tenant at that scale; it breaks down under the index-management overhead.
Our tenants can sign up and leave at any time, and we do not archive data. Our two primary search patterns are:
• A quick search on two fields across 13 document types
• A full-text search across all fields
I would greatly appreciate your recommendations on the most suitable approach for our workload.