
CogCache: LLM Caching & Cost Optimization

CogCache works as a proxy between your AI applications and your LLMs. By caching generated results, it accelerates content generation, speeds up responses, and cuts costs by eliminating the need to spend tokens on previously generated content.
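
To make the proxying idea concrete, here is a minimal sketch of an LLM caching proxy built with FastAPI. The endpoint path, upstream URL, cache-key scheme, and in-memory store are assumptions for illustration only, not CogCache's actual implementation.

```python
# Minimal illustrative sketch of an LLM caching proxy (not CogCache's code).
# Identical requests are served from the cache instead of consuming upstream tokens.
import hashlib
import json

import httpx
from fastapi import FastAPI, Request, Response

app = FastAPI()

UPSTREAM_URL = "https://api.openai.com/v1/chat/completions"  # assumed upstream
cache: dict[str, bytes] = {}  # in-memory cache; a real proxy would persist this


def cache_key(payload: dict) -> str:
    """Derive a deterministic key from the request body (model, messages, params)."""
    canonical = json.dumps(payload, sort_keys=True).encode()
    return hashlib.sha256(canonical).hexdigest()


@app.post("/v1/chat/completions")
async def proxy_completion(request: Request) -> Response:
    payload = await request.json()
    key = cache_key(payload)

    # Cache hit: return previously generated content, no tokens consumed.
    if key in cache:
        return Response(content=cache[key], media_type="application/json")

    # Cache miss: forward the request to the upstream LLM and store the result.
    async with httpx.AsyncClient() as client:
        upstream = await client.post(
            UPSTREAM_URL,
            json=payload,
            headers={"Authorization": request.headers.get("Authorization", "")},
            timeout=60.0,
        )
    cache[key] = upstream.content
    return Response(content=upstream.content, media_type="application/json")
```

Because the application already speaks the LLM API's request format, pointing it at this proxy instead of the provider is the only client-side change needed.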

Tech used

  1. React
  2. TypeScript
  3. Python
  4. FastAPI

Achievements

Engineered dashboards and APIs for a caching proxy that cut LLM inference costs by 50%. Worked across the React frontend and FastAPI backend to deliver analytics and volume tracking.
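
The analytics and volume tracking mentioned above could be exposed through an endpoint along these lines; the route, counters, and response shape here are hypothetical, shown only to illustrate the kind of API a dashboard would consume.

```python
# Illustrative sketch of a volume-tracking analytics endpoint (names are assumptions).
from collections import Counter

from fastapi import FastAPI

app = FastAPI()

# Counters a caching proxy might maintain per model: total requests and cache hits.
request_counts: Counter[str] = Counter()
cache_hits: Counter[str] = Counter()


@app.get("/analytics/volume")
def volume_summary() -> dict:
    """Return per-model request volume and cache hit rate for the dashboard."""
    return {
        model: {
            "requests": total,
            "cache_hits": cache_hits[model],
            "hit_rate": cache_hits[model] / total if total else 0.0,
        }
        for model, total in request_counts.items()
    }
```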