
CogCache: LLM Caching & Cost Optimization

CogCache works as a proxy between your AI applications and your LLMs. By caching generated results, it accelerates content generation, speeds up responses, and cuts costs by eliminating the need to spend tokens on previously generated content.
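
To make the proxying idea concrete, here is a minimal sketch of an LLM caching proxy built with FastAPI. The endpoint path, upstream URL, cache-key scheme, and in-memory store are assumptions for illustration only, not CogCache's actual implementation.

```python
# Minimal illustrative sketch of an LLM caching proxy (not CogCache's code).
# Identical requests are served from the cache instead of consuming upstream tokens.
import hashlib
import json

import httpx
from fastapi import FastAPI, Request, Response

app = FastAPI()

UPSTREAM_URL = "https://api.openai.com/v1/chat/completions"  # assumed upstream
cache: dict[str, bytes] = {}  # in-memory cache; a real proxy would persist this


def cache_key(payload: dict) -> str:
    """Derive a deterministic key from the request body (model, messages, params)."""
    canonical = json.dumps(payload, sort_keys=True).encode()
    return hashlib.sha256(canonical).hexdigest()


@app.post("/v1/chat/completions")
async def proxy_completion(request: Request) -> Response:
    payload = await request.json()
    key = cache_key(payload)

    # Cache hit: return previously generated content, no tokens consumed.
    if key in cache:
        return Response(content=cache[key], media_type="application/json")

    # Cache miss: forward the request to the upstream LLM and store the result.
    async with httpx.AsyncClient() as client:
        upstream = await client.post(
            UPSTREAM_URL,
            json=payload,
            headers={"Authorization": request.headers.get("Authorization", "")},
            timeout=60.0,
        )
    cache[key] = upstream.content
    return Response(content=upstream.content, media_type="application/json")
```

Because the application already speaks the LLM API's request format, pointing it at this proxy instead of the provider is the only client-side change needed.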

Tech used

  1. React
  2. TypeScript
  3. Python
  4. FastAPI

Achievements

Engineered dashboards and APIs for a caching proxy that cut LLM inference costs by 50%. Worked across the React frontend and FastAPI backend to deliver analytics and volume tracking.
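
The analytics and volume tracking mentioned above could be exposed through an endpoint along these lines; the route, counters, and response shape here are hypothetical, shown only to illustrate the kind of API a dashboard would consume.

```python
# Illustrative sketch of a volume-tracking analytics endpoint (names are assumptions).
from collections import Counter

from fastapi import FastAPI

app = FastAPI()

# Counters a caching proxy might maintain per model: total requests and cache hits.
request_counts: Counter[str] = Counter()
cache_hits: Counter[str] = Counter()


@app.get("/analytics/volume")
def volume_summary() -> dict:
    """Return per-model request volume and cache hit rate for the dashboard."""
    return {
        model: {
            "requests": total,
            "cache_hits": cache_hits[model],
            "hit_rate": cache_hits[model] / total if total else 0.0,
        }
        for model, total in request_counts.items()
    }
```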