
CogCache – LLM Caching & Cost Optimization
CogCache sits as a proxy between your AI applications and your LLMs, caching generated results so repeated requests are served without consuming tokens, which cuts costs and speeds up responses.
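A minimal sketch of the proxy idea, not the actual CogCache implementation: requests are keyed by a hash of the payload, served from the cache on a hit, and forwarded to an assumed upstream LLM API on a miss. The route, upstream URL, and in-memory cache are illustrative assumptions.

```python
import hashlib
import json

import httpx
from fastapi import FastAPI, Request
from fastapi.responses import JSONResponse

app = FastAPI()
cache: dict[str, dict] = {}  # in-memory stand-in for a real cache store

UPSTREAM_URL = "https://api.openai.com/v1/chat/completions"  # assumed upstream

@app.post("/v1/chat/completions")
async def proxy_completion(request: Request) -> JSONResponse:
    body = await request.json()
    # Key on the full request body so identical prompts map to the same entry.
    key = hashlib.sha256(json.dumps(body, sort_keys=True).encode()).hexdigest()

    if key in cache:
        # Cache hit: no upstream call, no tokens consumed.
        return JSONResponse(cache[key], headers={"x-cache": "HIT"})

    # Cache miss: forward to the upstream LLM and store the response.
    async with httpx.AsyncClient() as client:
        upstream = await client.post(
            UPSTREAM_URL,
            json=body,
            headers={"Authorization": request.headers.get("authorization", "")},
        )
    payload = upstream.json()
    cache[key] = payload
    return JSONResponse(payload, headers={"x-cache": "MISS"})
```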
Tech used
- React
- TypeScript
- Python
- FastAPI
Achievements
Engineered dashboards and APIs for a caching proxy that cut LLM inference costs by 50%, working across the React frontend and FastAPI backend to deliver usage analytics and request-volume tracking.
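A hypothetical shape for one of the volume-tracking endpoints described above; the route name, response fields, and in-memory counters are assumptions for illustration, not the project's actual API.

```python
from collections import Counter

from fastapi import FastAPI

app = FastAPI()
stats: Counter = Counter(hits=0, misses=0)  # would be persisted in practice

@app.get("/analytics/volume")
def volume_summary() -> dict:
    # Summarize request volume and cache effectiveness for the dashboard.
    total = stats["hits"] + stats["misses"]
    hit_rate = stats["hits"] / total if total else 0.0
    return {
        "total_requests": total,
        "cache_hits": stats["hits"],
        "cache_misses": stats["misses"],
        "hit_rate": round(hit_rate, 3),
    }
```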