Pinecone is a fully managed, serverless vector database: no infrastructure to provision or operate.
Setup¶
```shell
pip install pinecone  # formerly published on PyPI as pinecone-client
```
```python
from pinecone import Pinecone, ServerlessSpec

pc = Pinecone(api_key="key")
pc.create_index(
    name="docs",
    dimension=1536,   # must match the embedding model's output size
    metric="cosine",
    spec=ServerlessSpec(cloud="aws", region="us-east-1"),
)
```
Operations¶
```python
index = pc.Index("docs")

# Tuple form: (id, values, metadata). Vectors must be full
# 1536-dimensional lists in real use; elided here for brevity.
index.upsert(vectors=[("d1", [0.1, 0.2, ...], {"text": "PG guide"})])

results = index.query(vector=[0.1, ...], top_k=5, include_metadata=True)
```
- Zero ops
- Serverless
- Metadata filtering
- LangChain integration
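Metadata filtering uses Pinecone's MongoDB-style operators such as `$eq`, `$gte`, and `$in`, passed as a `filter` argument to `query`. A minimal sketch of building such a filter, assuming vectors were upserted with illustrative `category` (string) and `published_at` (unix seconds) metadata fields:

```python
import time


def recent_docs_filter(category, days=30, now=None):
    """Build a Pinecone metadata filter: exact category match plus a
    numeric range on a 'published_at' timestamp. The field names are
    illustrative assumptions, not required by Pinecone."""
    now = time.time() if now is None else now
    cutoff = now - days * 86400  # seconds per day
    return {
        "category": {"$eq": category},     # exact match
        "published_at": {"$gte": cutoff},  # "last N days" range
    }


flt = recent_docs_filter("guides", days=30, now=1_700_000_000)
# Passing it to a query requires a live index; shown for illustration:
# index.query(vector=query_vec, top_k=5, filter=flt, include_metadata=True)
```

Because metadata filtering happens inside the index, the filter narrows the candidate set before similarity ranking rather than post-filtering the top-k results.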
Practical Deployment with RAG¶
Pinecone is most commonly used as a vector database for RAG (Retrieval-Augmented Generation) pipelines. The typical workflow involves splitting documents into chunks, generating embeddings using OpenAI or another model, and storing them in Pinecone with metadata (source, date, category).
When querying, you combine vector similarity search with metadata filtering, for example retrieving similar documents but only from the last 30 days. Pinecone's serverless model means you do not pay for idle capacity and scaling is automatic. Integrations with LangChain and LlamaIndex reduce the wiring to a few lines of code. For sensitive data, consider a dedicated environment and encryption.
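The ingestion half of that workflow can be sketched as plain Python. The `chunk_text` splitter and the `embed` callable below are illustrative stand-ins (in practice you would use something like LangChain's text splitters and an OpenAI embeddings call); only the `(id, values, metadata)` upsert tuple shape is Pinecone's:

```python
def chunk_text(text, size=500, overlap=50):
    """Split text into overlapping character chunks. A simple stand-in
    for a real splitter; overlap preserves context across boundaries."""
    chunks = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + size])
        start += size - overlap
    return chunks


def to_records(doc_id, text, source, embed):
    """Turn one document into Pinecone upsert tuples, attaching the
    chunk text and its source as metadata. `embed` is any callable
    mapping text -> vector; stubbed so the sketch stays self-contained."""
    return [
        (f"{doc_id}-{i}", embed(chunk), {"text": chunk, "source": source})
        for i, chunk in enumerate(chunk_text(text))
    ]


# Stub embedder with the index's 1536 dimensions from the setup above.
records = to_records("d1", "word " * 300, "pg-guide", embed=lambda t: [0.0] * 1536)
# index.upsert(vectors=records)  # requires a live index
```

Storing the chunk text in metadata lets a query with `include_metadata=True` return the passages themselves, so the retrieval step can feed them straight into the generation prompt.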
Pinecone for Production¶
The simplest managed option: serverless indexes require no capacity planning, cluster sizing, or infrastructure tuning.