Training of Large Language Models (LLMs) by Indian Firms

Syllabus: GS3/ Science and Technology

Context

  • Bengaluru-based startup Sarvam AI unveiled two indigenous Large Language Models (LLMs), underscoring India’s push for sovereign, multilingual, and compute-efficient AI amid global competition.

Large Language Models (LLMs)

  • A Large Language Model (LLM) is a type of artificial intelligence (AI) algorithm that uses deep learning techniques and massive data sets to understand, summarize, generate, and predict new content.
  • Deep learning involves the probabilistic analysis of unstructured data, which enables the model to recognize distinctions between pieces of content without human intervention.
  • Through this analysis, the model learns how characters, words, and sentences function together, as the sketch below illustrates.
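
To make “probabilistic analysis” concrete, here is a toy, pure-Python sketch of the idea: a bigram model that counts which word follows which in raw, unlabelled text and converts those counts into next-word probabilities. Real LLMs replace the counting with deep neural networks trained on billions of documents, but the underlying principle is the same.

```python
from collections import Counter, defaultdict

# Toy bigram model: learn next-word probabilities purely from raw,
# unlabelled text -- no human annotation is involved.
corpus = "the model reads text . the model predicts the next word .".split()

counts = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    counts[prev][nxt] += 1  # count each observed word-to-word transition

def next_word_distribution(word):
    """Convert raw co-occurrence counts into probabilities."""
    total = sum(counts[word].values())
    return {w: c / total for w, c in counts[word].items()}

print(next_word_distribution("the"))
# -> {'model': 0.67, 'next': 0.33} (approx.) -- learned from statistics alone
```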

Indigenous LLM Ecosystem in India

  • Sarvam AI Models: Focus on efficiency, accuracy, and Indian-language capabilities. The models are intended to be open source, though broader public scrutiny of them is still ongoing.
  • BharatGen, incubated at IIT Bombay, trained a multilingual 17-billion-parameter model for sectors like education and healthcare.
  • Gnani.ai launched compact speech and text-to-speech models.

How Are LLMs Trained?

  • GPU Clusters: LLM training requires massive computational power using clusters of Graphics Processing Units (GPUs). Thousands of GPUs operate simultaneously for weeks or months.
  • Data as the Core Input: Training relies on enormous datasets, often scraped from the Internet.
  • Model Parameters: Parameters represent the internal weights through which models learn patterns. Sarvam AI trained models with 35 billion and 105 billion parameters.
    • Larger parameter counts generally improve capability but require far more computation, as the back-of-envelope sketch below shows.
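
To see why parameter counts drive compute requirements, here is a back-of-envelope estimate in Python. The training-FLOPs rule of thumb (~6 × parameters × tokens), the corpus size, and the per-GPU throughput are all illustrative assumptions, not Sarvam AI's actual figures.

```python
# Back-of-envelope cost of a 105-billion-parameter model. All figures are
# illustrative rules of thumb, NOT Sarvam AI's actual numbers.
n_params = 105e9              # model parameters (internal weights)
bytes_per_param = 2           # fp16/bf16 storage per weight
tokens = 1e12                 # assumed size of the training corpus

weight_memory_gb = n_params * bytes_per_param / 1e9
train_flops = 6 * n_params * tokens      # widely used ~6*N*D approximation

gpu_flops = 300e12            # rough sustained throughput of one modern GPU
n_gpus = 1000                 # size of the assumed training cluster
days = train_flops / (gpu_flops * n_gpus) / 86400

print(f"weights alone: {weight_memory_gb:.0f} GB of memory")   # ~210 GB
print(f"time on {n_gpus} GPUs: ~{days:.0f} days")              # ~24 days
```

Even under these generous assumptions, the weights alone occupy hundreds of gigabytes before activations and optimizer states are counted, which is why training must be spread across large GPU clusters for weeks.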

Key Training Methodologies Used

  • Data Curation: Collecting high-quality datasets in Indian languages.
    • Sources include government documents, literature, media, and synthetic data generation.
    • Curation is critical for improving performance beyond English-centric AI systems.
  • Pre-Training: The models learn general language patterns by predicting the next token in large unlabelled datasets (illustrated in the first sketch after this list).
    • This stage builds foundational reasoning and grammar capabilities.
  • Fine-Tuning: Models are adapted for specific tasks using curated datasets.
    • Tools such as Hugging Face and LangChain support instruction tuning, classification, and domain adaptation.
  • Alignment/RLHF (Reinforcement Learning from Human Feedback): Human raters rank model outputs, teaching the model to be safer, more accurate, and better aligned with human intent while discouraging harmful or biased responses (see the reward-model sketch after this list).
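
The two most technical steps above can be made concrete with small PyTorch sketches (assuming PyTorch is installed; the tiny models here are illustrative stand-ins, not real LLM architectures). The first shows the pre-training objective: shift the token sequence by one position and penalize the model, via cross-entropy, whenever it mispredicts the next token. Fine-tuning reuses exactly the same loss, only on smaller curated datasets.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

torch.manual_seed(0)
vocab_size, dim = 100, 32

# A deliberately tiny stand-in for an LLM: embedding -> output head.
# Real models place dozens of transformer layers in between.
embed = nn.Embedding(vocab_size, dim)
head = nn.Linear(dim, vocab_size)

tokens = torch.randint(0, vocab_size, (1, 16))    # one "sentence" of token ids
inputs, targets = tokens[:, :-1], tokens[:, 1:]   # each target is the NEXT token

logits = head(embed(inputs))                      # shape: (1, 15, vocab_size)
loss = F.cross_entropy(logits.reshape(-1, vocab_size), targets.reshape(-1))
loss.backward()                                   # gradients adjust the weights
print(f"next-token loss: {loss.item():.3f}")
```

The second sketches reward modeling, the core of RLHF: given features of a human-preferred response and a rejected one, a pairwise (Bradley-Terry-style) loss pushes the reward model to score the preferred response higher. The feature vectors here are random placeholders standing in for real response encodings.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

torch.manual_seed(0)
dim = 32
reward_model = nn.Sequential(nn.Linear(dim, 64), nn.ReLU(), nn.Linear(64, 1))

# Random placeholders for encoded responses: one the rater PREFERRED,
# one the rater REJECTED, over a batch of 8 prompts.
chosen, rejected = torch.randn(8, dim), torch.randn(8, dim)

# Loss shrinks when the chosen response outscores the rejected one,
# nudging the reward model toward human preferences.
margin = reward_model(chosen) - reward_model(rejected)
loss = -F.logsigmoid(margin).mean()
loss.backward()
print(f"preference loss: {loss.item():.3f}")
```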

Challenges in Training LLMs in India

  • Limited Indian Language Data: Scarcity of high-quality datasets in Indian languages reduces model performance.
    • Many systems translate Indian-language input into English before processing, which increases token usage and latency (see the tokenizer sketch below); the resulting suboptimal native performance limits adoption among non-English users.
  • High Capital Requirements: Training frontier models demands substantial financial investment. Startups often lack immediate commercial returns to justify such costs.
  • Infrastructure Constraints: Access to high-end computing facilities remains limited without government support.
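
The token-inflation problem noted under the first challenge is easy to demonstrate. The sketch below uses tiktoken (OpenAI's open-source tokenizer library) to compare how an English-centric tokenizer splits roughly equivalent English and Hindi sentences; exact counts vary by tokenizer, but Devanagari text typically fragments into far more tokens.

```python
# Requires: pip install tiktoken (the first run downloads the vocabulary file).
import tiktoken

enc = tiktoken.get_encoding("cl100k_base")  # an English-centric BPE vocabulary

english = "India is a vast and diverse country."
hindi = "भारत एक विशाल और विविध देश है।"      # roughly the same sentence in Hindi

for label, text in [("English", english), ("Hindi", hindi)]:
    n_tokens = len(enc.encode(text))
    print(f"{label}: {len(text)} characters -> {n_tokens} tokens")
# The Hindi sentence typically fragments into several times more tokens,
# which is why native Indic tokenizers matter for cost and latency.
```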

IndiaAI Mission

  • The IndiaAI Mission is the flagship initiative to build a comprehensive, sovereign AI ecosystem for India.
  • It focuses on developing high-performance computing infrastructure, indigenous foundational models, and safe, ethical AI, under the vision of “Making AI in India and Making AI Work for India”.
  • Under the Mission, India has built up a shared pool of 38,000 GPUs, providing affordable access to world-class AI resources.
  • A Graphics Processing Unit (GPU) is a specialized processor designed for massively parallel computation, which makes it far more efficient than a conventional CPU at tasks such as image processing and AI training, as the sketch below illustrates.
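
As a rough illustration of that difference, the following PyTorch sketch (an assumption of this example; any GPU-capable array library would do) times the same large matrix multiplication, the core operation inside neural networks, on the CPU and, if a CUDA device is available, on the GPU. Timings are indicative only and depend heavily on hardware.

```python
import time
import torch

a = torch.randn(4096, 4096)
b = torch.randn(4096, 4096)

start = time.time()
a @ b                                   # runs on a handful of CPU cores
print(f"CPU matmul: {time.time() - start:.3f}s")

if torch.cuda.is_available():           # GPU path runs only if CUDA exists
    a_gpu, b_gpu = a.cuda(), b.cuda()
    a_gpu @ b_gpu                       # warm-up (the first CUDA call is slow)
    torch.cuda.synchronize()
    start = time.time()
    a_gpu @ b_gpu                       # same work, spread over thousands of cores
    torch.cuda.synchronize()            # wait for the GPU to finish
    print(f"GPU matmul: {time.time() - start:.3f}s")
```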

Source: TH

 
