What Am I Trying to Build?

2 min read · By Ashutosh Rana
Tags: personal, raw thoughts, tee, term extraction

Lab Entry #1

   

What I Want to Build

   

I’m building something I call the Term Extraction Engine (TEE).

   

Its job is simple to say, but tricky to do well: find and extract domain-specific terms from a given body of knowledge.

   

Right now, that knowledge might be a PDF textbook, a Word document, or even raw text. Later, it could be any kind of structured or unstructured content. The TEE will take that input, understand the context or domain I’ve specified, and return a clean, precise set of terms.
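To make the "input plus domain in, terms out" idea concrete, here is a minimal sketch of what such an interface could look like. It is not the actual TEE: the function name `extract_terms`, the `domain_seeds` parameter, the tiny stopword list, and the frequency-based ranking are all assumptions chosen for illustration, and it works on raw text only (PDF or Word input would need a separate text-extraction step first).

```python
import re
from collections import Counter

# Hypothetical, deliberately tiny stopword list (assumption for this sketch).
STOPWORDS = {
    "a", "an", "the", "of", "and", "or", "to", "in", "is", "are",
    "on", "for", "that", "by", "with", "it", "as",
}


def extract_terms(text, domain_seeds, max_len=3):
    """Return (term, count) pairs for n-grams that contain a domain seed word.

    Candidates are word n-grams (1..max_len words) that contain no stopwords
    and do not cross punctuation. Results are sorted by descending count,
    then alphabetically.
    """
    seeds = {s.lower() for s in domain_seeds}
    counts = Counter()

    # Split on punctuation so candidates never span sentence boundaries.
    for segment in re.split(r"[.!?;:,()\n]+", text.lower()):
        words = re.findall(r"[a-z0-9]+(?:-[a-z0-9]+)*", segment)

        # Break the segment into runs of consecutive non-stopwords.
        runs, current = [], []
        for word in words:
            if word in STOPWORDS:
                if current:
                    runs.append(current)
                current = []
            else:
                current.append(word)
        if current:
            runs.append(current)

        # Count every n-gram in each run that mentions at least one seed word.
        for run in runs:
            for n in range(1, max_len + 1):
                for i in range(len(run) - n + 1):
                    gram = run[i:i + n]
                    if any(w in seeds for w in gram):
                        counts[" ".join(gram)] += 1

    return sorted(counts.items(), key=lambda item: (-item[1], item[0]))


if __name__ == "__main__":
    sample = (
        "A smart contract runs on a blockchain. "
        "Every blockchain node validates the smart contract."
    )
    for term, count in extract_terms(sample, {"blockchain", "contract"})[:3]:
        print(term, count)
```

A sketch like this also shows why the problem is tricky: raw frequency surfaces noisy candidates such as "contract runs" alongside real terms like "smart contract", which is exactly where domain-aware filtering has to do the hard work.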

   

My first target domain is Blockchain and Web3.

   

The extracted terms will feed directly into a dictionary I’m building, one that explains blockchain and Web3 concepts in plain, human language. The goal is to make complex ideas feel simple, without dumbing them down.

   

This isn’t about building just another glossary. It’s about creating a research tool that helps people actually learn.

   

Why I’m Building It from Scratch

   

Could I just use an existing term extraction API or NLP service? Sure.
But here’s why I’m not:

   

  • Control: Off-the-shelf tools are generic. They don’t think in terms of blockchain, Web3, or any other niche I care about.

   

  • Cost: Many services charge per request or token. I want this to run locally, for free, without depending on anyone else’s API.

   

  • Precision: I don’t just want any terms. I want the right terms, the ones that actually matter for a given domain.

   

  • Learning: Building this from scratch forces me to understand the guts of term extraction: the algorithms, heuristics, and trade-offs. This is knowledge I can apply anywhere.

I want a tool that’s mine from the inside out, one I can tweak, extend, and evolve without waiting for a vendor to add a feature.

   
