- Built and deployed a production-grade RAG chatbot for study-abroad counseling, designing a router workflow with retrieval grading, user profile extraction, and resilient fallback handling for LLM calls.
- Built an end-to-end ingestion and retrieval pipeline over PostgreSQL + PGVector, covering chunking, embedding, and efficient semantic search across 3,100+ documents.
- Implemented a hybrid (keyword + semantic) citation validator with clickable source-linked outputs and localized currency normalization for NPR.
- Optimized API reliability via async I/O, connection pooling, and multi-worker Gunicorn. Load-tested with Locust; deployed via systemd on Azure VPS.
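The ingestion pipeline above begins with chunking. A minimal sketch of an overlapping character-window chunker (the window size and overlap values are illustrative, not the production settings; the real pipeline then embeds each chunk and writes it to PGVector):

```python
def chunk_text(text: str, size: int = 500, overlap: int = 50) -> list[str]:
    """Split text into overlapping character windows for embedding.

    Overlap keeps sentences that straddle a boundary retrievable
    from at least one chunk.
    """
    if size <= overlap:
        raise ValueError("size must exceed overlap")
    chunks = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + size])
        start += size - overlap
    return chunks
```

Each chunk would then be embedded and inserted into a PGVector column for semantic search.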
Introduction
As my latest professional venture at Hobes Tech, I was responsible for architecting and building a production-grade RAG chatbot for an education consultancy. The project was a trial by fire that tested my resourcefulness: fitting a high-performance ingestion pipeline onto a constrained VPS and maintaining reliability despite model provider failures.
Purpose and Goal
In study-abroad consulting, trust is the primary currency; counselors bear responsibility for a student's entire future. We couldn't afford hallucinations. My goal was to ground every response in verifiable facts by implementing a robust citation system where every claim includes a clickable link directly to the source document.
Spotlight
- Citation Engine: To solve the attribution problem, I implemented a post-processing hybrid correction algorithm inspired by recent research (e.g., Enhancing RAG Accuracy Through Post-Processing Citation Correction). By combining semantic similarity with keyword matching, the system dynamically drops or reassigns citations to ensure the AI isn't just guessing where it found the info.
- Performance Optimization: Users expect instant replies, but a workflow involving an orchestrator, retriever, reranker, and currency localizer can get heavy. I optimized the Time to First Token (TTFT) to under one second on average by:
  - Adding a path for FAQs in the workflow that entirely skips the retrieval and reranking route.
  - Moving heavy ingestion tasks to background thread pools.
  - Using async I/O for all network-bound operations.
  - Implementing connection pooling for the database.
  - Deploying via Gunicorn with multiple workers to handle concurrent traffic.
- Currency Localization: To improve the user experience, I implemented a mechanism that detects any mention of currency or monetary amounts in the chatbot's response and converts it on the fly to Nepali rupees with proper formatting.
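The core of such a localizer is a regex pass plus Nepali-style digit grouping (lakh/crore: the last three digits, then pairs). A minimal sketch, assuming a fixed illustrative exchange rate and handling only $-denominated amounts; the production version would use live rates and cover more currency patterns:

```python
import re

USD_TO_NPR = 133.0  # illustrative rate; production would fetch a live rate

def format_npr(amount: float) -> str:
    """Group digits in the Nepali lakh/crore style: 2660000 -> NPR 26,60,000."""
    whole = str(int(round(amount)))
    if len(whole) <= 3:
        return "NPR " + whole
    head, tail = whole[:-3], whole[-3:]
    parts = []
    while len(head) > 2:
        parts.insert(0, head[-2:])
        head = head[:-2]
    if head:
        parts.insert(0, head)
    return "NPR " + ",".join(parts + [tail])

def localize_currency(text: str, rate: float = USD_TO_NPR) -> str:
    """Replace $-denominated amounts in text with formatted NPR equivalents."""
    def repl(match: re.Match) -> str:
        usd = float(match.group(1).replace(",", ""))
        return format_npr(usd * rate)
    return re.sub(r"\$\s?([\d,]+(?:\.\d+)?)", repl, text)
```

For example, `localize_currency("Tuition is $20,000 per year")` rewrites the dollar figure as `NPR 26,60,000` at the assumed rate.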
Current Status
It's in production and is being used regularly by students.
Lessons Learned
This project taught me the production gap — the difference between a demo and a system that survives real-world constraints. I learned the trade-offs between agentic vs. deterministic workflows and how to build failure-tolerant systems when upstream APIs are unreliable.