Data Science Project-Based Winter Training Program

Offline Course
1k+ interested Geeks

Build a real-world search engine from scratch in our 15-day, project-based Data Science course. Go end to end through Semantic Search, RAG implementation, Recommendation Systems, Elasticsearch, a GenAI Chatbot, and much more, with final deployment on Google Cloud Platform.

Level: Beginner to Advanced
Course duration: 2 Weeks
Seats left: 50

Make the most of your Winter Break
Enroll, Build Projects, Enhance Skills! 
For further queries, reach us via Call/WhatsApp on: +91-8130806418

Offline Locations

Course Overview


IBM Certification
Earn an industry-recognized IBM Certificate on successful completion.

Search Engine
A search engine with Pydantic schemas, data ingestion, preprocessing, chunking, and a TF-IDF baseline.

Semantic Search
Embeddings using Word2Vec / GloVe and Sentence Transformers, with a vector database such as Pinecone.

RAG (Retrieval-Augmented Generation)
Implement RAG using smart chunking, retrieval, and ranking algorithms for grounded answers.

GenAI Chatbot
LLM workflow (OpenAI/Gemini), Elasticsearch, BM25, and Hybrid Search.

Recommendation System
A recommendation system that uses embedding similarity and ranking algorithms to surface relevant content suggestions.

Note: This 15-day, project-based course does not cover theory in detail. Instead, it is centered entirely on building major and minor projects, and it explains only the key concepts used during the project-building process. The completed projects help learners strengthen their resumes and stand out with a stronger profile.

Course Content

Day 1: Problem Understanding + Clean Architecture
  • Define the project scope
  • Draw a simple workflow
  • Set up repo structure with clean modules and clear boundaries
  • Create Pydantic models for Article, Query, and Response

Tools: Python, GitHub, Draw.io, Pydantic.
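
As an illustration of the Day 1 schema work, here is a minimal sketch of the three models. The field names (`id`, `title`, `body`, `tags`, `top_k`, `hits`) and the `SearchHit` and `is_valid_query` helpers are assumptions made for this example, not the course's actual schema.

```python
from typing import List

from pydantic import BaseModel, Field, ValidationError


class Article(BaseModel):
    """One stored document (field names are illustrative assumptions)."""
    id: str
    title: str = Field(..., min_length=1)
    body: str
    tags: List[str] = Field(default_factory=list)


class Query(BaseModel):
    """An incoming search request."""
    text: str = Field(..., min_length=1)   # reject empty queries
    top_k: int = Field(5, ge=1, le=50)     # default 5, bounded to 1..50


class SearchHit(BaseModel):
    """A single ranked result (hypothetical helper model)."""
    article_id: str
    score: float


class Response(BaseModel):
    """Ranked results returned for a query."""
    query: str
    hits: List[SearchHit] = Field(default_factory=list)


def is_valid_query(payload: dict) -> bool:
    """Return True when the payload satisfies the Query schema."""
    try:
        Query(**payload)
        return True
    except ValidationError:
        return False
```

Keeping validation in the models means every module that accepts an `Article` or `Query` can rely on the same constraints, which is one way to get the clear boundaries mentioned above.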

Day 2: Data Ingestion + Storing Articles
  • Load sample articles
  • Store raw articles in MongoDB
  • Add basic deduplication
  • Write PyTest tests for ingestion flow and duplicate detection

Tools: MongoDB, SQLAlchemy (optional), PyTest, Logging.
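
The deduplication step can be sketched as a content fingerprint checked at ingestion time. This is one possible approach, not necessarily the one taught in the course: the `InMemoryArticleStore` class is a hypothetical stand-in for a MongoDB collection so the example runs without a database, and the `title`/`body` keys are assumed.

```python
import hashlib
from typing import Dict, Iterable, List


def content_fingerprint(title: str, body: str) -> str:
    """Hash case- and whitespace-normalized text so trivial variants collide."""
    normalized = " ".join(f"{title} {body}".lower().split())
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


class InMemoryArticleStore:
    """Hypothetical stand-in for a MongoDB collection keyed by fingerprint."""

    def __init__(self) -> None:
        self._docs: Dict[str, dict] = {}

    def ingest(self, articles: Iterable[dict]) -> List[str]:
        """Store unseen articles and return the fingerprints that were inserted."""
        inserted: List[str] = []
        for article in articles:
            key = content_fingerprint(article["title"], article["body"])
            if key in self._docs:
                continue  # duplicate content: skip it
            self._docs[key] = article
            inserted.append(key)
        return inserted

    def __len__(self) -> int:
        return len(self._docs)
```

With a real MongoDB collection, a similar effect could be achieved by storing the fingerprint in its own field and placing a unique index on it.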

Day 3: Text Cleaning & Preprocessing
  • Apply lowercasing, punctuation cleanup, and stopword removal
  • Add stemming or lemmatization
  • Extract metadata like title and tags
  • Validate cleaned schemas with Pydantic and test the preprocessing steps

Tools: nltk, spacy, re, Pydantic, PyTest.
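
A minimal sketch of the first cleaning steps (lowercasing, punctuation cleanup, stopword removal) using only `re`. The tiny `STOPWORDS` set is an assumption made for this example; nltk and spaCy provide fuller lists, and stemming or lemmatization is left to those libraries.

```python
import re
from typing import List

# Tiny illustrative stopword list (an assumption); real projects use a fuller one.
STOPWORDS = frozenset(
    {"a", "an", "and", "the", "is", "are", "of", "to", "in", "for", "on", "with"}
)

# Anything that is not a lowercase ASCII letter, digit, or whitespace.
# Note: this also strips accented and non-Latin characters.
_NON_WORD = re.compile(r"[^a-z0-9\s]+")


def clean_text(text: str) -> List[str]:
    """Lowercase, replace punctuation with spaces, and drop stopwords."""
    lowered = text.lower()
    no_punct = _NON_WORD.sub(" ", lowered)
    return [tok for tok in no_punct.split() if tok not in STOPWORDS]
```

Returning a token list rather than a joined string keeps the output easy to pass to a stemmer or a vectorizer in later steps.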

Day 4: Keyword Search (Baseline Search Engine)
  • Build TF-IDF vectors (scikit-learn)
  • Implement cosine similarity
  • Create API route
  • Rank results by TF-IDF score and verify accuracy

Tools: scikit-learn, NumPy, FastAPI, PyTest.
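
To show what the baseline computes, here is a from-scratch sketch of TF-IDF vectors and cosine-similarity ranking in plain Python. The course itself lists scikit-learn for this step; this hand-rolled version is a swapped-in illustration, and the function names (`build_tfidf`, `cosine`, `search`) are hypothetical. The IDF uses the smoothed form ln((1 + n) / (1 + df)) + 1.

```python
import math
from collections import Counter
from typing import Dict, List, Tuple

SparseVec = Dict[str, float]


def build_tfidf(docs: List[List[str]]) -> Tuple[List[SparseVec], Dict[str, float]]:
    """Return one sparse TF-IDF vector per tokenized document, plus the IDF table."""
    n_docs = len(docs)
    doc_freq = Counter(term for doc in docs for term in set(doc))
    # Smoothed IDF, always positive: ln((1 + n) / (1 + df)) + 1
    idf = {t: math.log((1 + n_docs) / (1 + df)) + 1.0 for t, df in doc_freq.items()}
    vectors = [{t: c * idf[t] for t, c in Counter(doc).items()} for doc in docs]
    return vectors, idf


def cosine(a: SparseVec, b: SparseVec) -> float:
    """Cosine similarity of two sparse vectors; 0.0 if either is empty."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def search(
    query_tokens: List[str],
    vectors: List[SparseVec],
    idf: Dict[str, float],
    top_k: int = 3,
) -> List[Tuple[int, float]]:
    """Rank document indices by cosine similarity to the query, best first."""
    # Terms never seen in the corpus are ignored.
    q_vec = {t: c * idf[t] for t, c in Counter(query_tokens).items() if t in idf}
    scored = [(i, cosine(q_vec, v)) for i, v in enumerate(vectors)]
    scored = [(i, s) for i, s in scored if s > 0.0]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]
```

The token lists passed in are expected to be already cleaned, for example by the Day 3 preprocessing step. An API route would then only need to tokenize the incoming query and call `search`.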

What Sets Us Apart

IBM Certification
Earn an industry-recognized IBM Certificate after successfully completing the program.

24x7 Doubt Support
AI Chat Support for instant doubt resolution, plus a dedicated Teaching Assistant exclusively assigned to your batch.

Course Benefits
1-year access to the online course materials and premium recorded videos.

Mentor Sessions
Get 15 days of direct guidance from an industry expert, feedback on your approach, and career pointers.

Upcoming Batches

Batch | Mentor | Starting From | Timings

Frequently Asked Questions

01. What prerequisites do I need to join this Data Science course?
02. Do I get doubt support?
03. Do I receive an internship certificate after completing this course?
04. Can I make the payment through PayPal?
05. Is there a number I can contact for queries?
06. How long will I have access to the online course material included with this course?