India has quietly built something significant. While global conversations about AI infrastructure tend to orbit around OpenAI, Hugging Face, or Google DeepMind, the Government of India launched a sovereign alternative in March 2025: AIKosh, the country’s first national AI datasets and models platform. For innovators, researchers, and patent strategists operating in the Indian technology landscape, AIKosh is not merely a government initiative. It is a structural shift in how AI innovation is resourced and shared and, critically, in how intellectual property in this space will be generated and contested.
What Is AIKosh?
AIKosh, formally known as AIKosha: IndiaAI Datasets Platform, is a centralised, government-operated repository of AI datasets, pretrained models, and development tools. It was officially launched on 6 March 2025 by Union Minister Ashwini Vaishnaw under the Ministry of Electronics and Information Technology (MeitY), as a core pillar of the broader IndiaAI Mission.
The platform is built and maintained by IndiaAI, an Independent Business Division within Digital India Corporation, with technical development executed in partnership with NeGD (National e-Governance Division) and Daffodil Software. It is publicly accessible at aikosh.indiaai.gov.in.
The analogy to Hugging Face is deliberate and instructive. Like Hugging Face, AIKosh provides a searchable, structured repository of AI models and datasets that developers can discover, download, and build upon. Unlike Hugging Face, it is a state-operated infrastructure initiative with explicit sovereignty objectives, designed to reduce India’s dependence on synthetic and foreign-origin training data, and to ensure that AI models developed for Indian use cases are trained on verified, India-centric inputs.
Scale and Growth: By the Numbers
The growth trajectory of AIKosh since its formal launch has been notable. Here is what verified government sources and independent trackers confirm as of early 2026:
- 10,665+ datasets hosted across 20 sectors
- 300 AI models and about 441 organisations on the platform
- 1,80,20,260 visits recorded (roughly 18 million)
- 20 sectors covered, including healthcare, agriculture, education, governance, climate, and assistive technologies
The platform started with over 300 datasets and 80+ models at launch. Within months, dataset volume grew more than 30-fold, from about 300 to over 10,000, reflecting aggressive onboarding of institutional contributors.
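The growth figures can be checked with simple arithmetic. The sketch below uses only the counts quoted in this article and treats the "+" figures as lower bounds, so the ratios are approximate rather than official statistics; the variable names are illustrative.

```python
# Counts quoted in this article; the "+" figures are treated as lower bounds.
datasets_at_launch = 300
datasets_early_2026 = 10_665
models_at_launch = 80
models_early_2026 = 300

# Ratio of current to launch volume for each asset type.
dataset_growth = datasets_early_2026 / datasets_at_launch
model_growth = models_early_2026 / models_at_launch

print(f"Datasets grew about {dataset_growth:.0f}-fold")
print(f"Models grew about {model_growth:.2f}-fold")
```

On these numbers, dataset volume grew roughly 35-fold, comfortably above the "more than 30-fold" figure, while the model count grew a little under fourfold.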
Contributors include academic institutions (IIT Bombay, IIT Madras, AI4Bharat), private companies (Sarvam AI, Ola Krutrim, Gnani AI), and government ministries. IIT Bombay alone released 16 datasets on AIKosh in June 2025, positioning itself as one of the platform’s largest academic contributors.
The Seven-Pillar Architecture: Where AIKosh Fits
AIKosh is one of seven strategic pillars of the IndiaAI Mission, which was approved by the Union Cabinet in March 2024. The seven pillars are:
- IndiaAI Compute: Subsidised GPU access; over 38,000 GPUs onboarded
- IndiaAI Datasets Platform (AIKosh): National AI data and model repository
- IndiaAI Application Development Initiatives: Sector-specific AI solutions
- IndiaAI FutureSkills: AI skilling; 570 AI Data Labs and 27 IndiaAI labs across states
- IndiaAI Research: Academic AI research funding and infrastructure
- IndiaAI Startups: Startup acceleration; 12 startups selected across two phases including Sarvam AI, Soket AI, Gnani AI, BharatGen (IIT Bombay), and Fractal Analytics
- IndiaAI Safe and Trusted AI: Ethical AI governance and compliance frameworks
Sovereign Models: The Strategic Core
What distinguishes AIKosh from a passive data repository is its role in fostering sovereign foundation models: large AI models trained on Indian data, developed by Indian institutions, and capable of outperforming foreign models on Indic-language tasks.
The Government has funded four startups specifically to develop sovereign foundation models: Sarvam AI, BharatGen, Gnani AI, and Soket AI. Models from all four were formally launched at the IndiaAI Impact Summit 2026 in February 2026.
A March 2026 Rajya Sabha statement confirmed that models developed by Sarvam AI “have demonstrated high accuracy in document understanding and Indic language processing” and “in some cases perform better than leading frontier models for Indian language tasks.”
Key models and datasets now hosted on AIKosh include:
- Text-to-Speech (TTS) models in Bengali, Gujarati, Kannada, and Malayalam
- IndicVoices: a 12,000-hour multilingual speech dataset covering 22 Indian languages and 208 districts, developed jointly by IIT Madras, AI4Bharat, and Sarvam AI
- BhasaAnuvaad: the largest speech translation dataset for Indian languages, covering 44,400 hours of audio across 13 languages, launched by AI4Bharat
- Sovereign large language and multimodal models from Sarvam AI and BharatGen
The AI sandbox environment on AIKosh allows researchers to train and experiment with these models using secure, API-based access without requiring independent cloud infrastructure.
Who Can Contribute and How
AIKosh operates a structured contributor model that is directly relevant for organisations considering IP strategy around their data assets.
The platform defines four user roles: Explorer, Contributor, Organisation Admin, and Platform Admin. Only institutional entities (startups, corporates, academic institutions, statutory bodies, and non-profits) are eligible to contribute. Individual submissions are explicitly prohibited.
Each contributing organisation must designate an authorised representative (Administrative SPOC) and comply with the Digital Personal Data Protection Act (DPDPA), the National Data Sharing and Accessibility Policy (NDSAP), and all applicable Government of India data governance frameworks. Datasets must be non-personal, India-specific, and anonymised.
Critically, each dataset and model on AIKosh is governed by one of two permission settings: Open (downloadable by all registered users) or Restricted (discoverable but requiring explicit contributor approval before download). This permission architecture has direct implications for how contributed assets are protected and controlled.
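The two-tier permission model can be illustrated with a short sketch. This is not AIKosh's actual implementation or API: the `Permission`, `Asset`, and `can_download` names are hypothetical, and the rule that Restricted downloads also require a registered account is an assumption made for the example.

```python
from dataclasses import dataclass, field
from enum import Enum


class Permission(Enum):
    OPEN = "open"              # downloadable by all registered users
    RESTRICTED = "restricted"  # discoverable; download needs contributor approval


@dataclass
class Asset:
    """A dataset or model listing (hypothetical structure for illustration)."""
    name: str
    permission: Permission
    # Users the contributor has explicitly approved (only used when RESTRICTED).
    approved_users: set[str] = field(default_factory=set)


def can_download(asset: Asset, user_id: str, is_registered: bool) -> bool:
    """Apply the Open/Restricted rule described in the article."""
    if not is_registered:
        # Assumption: every download requires a registered account.
        return False
    if asset.permission is Permission.OPEN:
        return True
    # Restricted: the contributor must have approved this specific user.
    return user_id in asset.approved_users


speech_corpus = Asset("example-speech-corpus", Permission.RESTRICTED)
print(can_download(speech_corpus, "researcher-1", is_registered=True))
speech_corpus.approved_users.add("researcher-1")
print(can_download(speech_corpus, "researcher-1", is_registered=True))
```

The point for IP strategy is that a Restricted listing remains visible to everyone even though access to the underlying files is gated by the contributor.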
The IP Implications: What Patent Consultants Need to Understand
The emergence of AIKosh as a large, government-curated repository of AI models intersects with a rapidly evolving Indian patent landscape in ways that organisations cannot afford to overlook.
- AI Patent Filings Are Surging
India is now the fifth-largest jurisdiction globally for AI patent filings. Between 2010 and 2025, over 86,000 AI-related patents were filed in India, more than 25% of all technology patents in the country. Between 2021 and 2025 alone, AI patent filings were seven times higher than the cumulative filings from 2010 to 2015.
Machine Learning remains the dominant technique, comprising over 55% of AI patent filings. Generative AI accounts for 50% of all ML-related patents filed in India and 28% of India’s total AI patent portfolio. India ranks among the top five countries globally in GenAI patents, despite representing only 6% of global GenAI activity.
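The percentages above are consistent with one another, as a quick check shows. The sketch below uses only the shares quoted in this article and treats "over 55%" as a lower bound; the variable names are illustrative.

```python
# Shares quoted in this article (fractions of India's AI patent filings).
ml_share_of_ai = 0.55      # "over 55%", treated as a lower bound
genai_share_of_ml = 0.50   # GenAI as a share of ML-related patents

# Implied GenAI share of all AI filings: a share of a share.
implied_genai_share_of_ai = ml_share_of_ai * genai_share_of_ml

print(f"Implied GenAI share of AI filings: {implied_genai_share_of_ai:.1%}")
```

The product is 27.5%. Because the ML share is stated as "over 55%", this is a floor, which is in line with the 28% figure quoted for GenAI's share of the total AI portfolio.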
- The CRI Guidelines 2025 Open the Door for AI Patents
The CGPDTM’s Guidelines for Examination of Computer Related Inventions (CRI), 2025, published on 29 July 2025, represent the most consequential regulatory development for AI patenting in India in a decade.
The 2025 CRI Guidelines codify “technical effect,” “technical advancement,” or “technical contribution” as the primary test for AI/ML/DL inventions seeking to overcome the Section 3(k) exclusion (which bars patents on mathematical methods, algorithms, and computer programs per se). The shift is significant: the question is no longer whether an invention involves algorithms, but whether it produces a concrete, implementable technical improvement. AI innovations embedded in AIKosh, from speech synthesis to document classification, can now credibly meet this threshold.
- Open-Source Models and Prior Art Risk
The open-access nature of AIKosh creates a direct prior art consideration. When a model is published on AIKosh, whether under an Open or a Restricted permission, it becomes part of the publicly disclosed prior art landscape. Any patent application that claims innovations substantially covered by existing AIKosh-hosted models faces a novelty challenge.
Conversely, organisations that contribute models to AIKosh before filing patent applications risk inadvertently disclosing their inventions. India does not provide a universal grace period for inventor disclosures before filing, which makes the sequence of contribution versus patent filing a material strategic decision.
- Ownership of AI-Generated Output Remains Unresolved
India’s current IP law (the Patents Act, 1970, and the Copyright Act, 1957) does not recognise AI as an inventor or author. This creates a gap: if an organisation uses AIKosh-hosted models to develop a novel application, who owns the resulting IP? The contributing model developer? The downstream innovator? The platform operator?
These questions do not yet have settled answers in Indian jurisprudence. What is clear is that organisations building on AIKosh-hosted assets should establish clear contractual ownership frameworks before development begins and should ensure that inventive contributions of human engineers are clearly documented for patent prosecution purposes.
- AIKosh Could Reshape India’s AI Prior Art Landscape
Beyond its role as a national AI infrastructure initiative, AIKosh may gradually emerge as one of the most important prior art repositories in India’s AI ecosystem. As more datasets, pretrained models, fine-tuned systems, and multimodal architectures become publicly discoverable through the platform, AIKosh could significantly influence how novelty and inventive step are evaluated for future AI patent applications.
The platform also introduces emerging questions around derivative model ownership and downstream commercialization rights. If developers fine-tune AIKosh-hosted models or build applications on top of sovereign datasets, the boundaries of ownership, licensing rights, and patent eligibility may become increasingly complex. These issues remain largely unresolved under current Indian IP jurisprudence and may eventually require dedicated regulatory or judicial clarification.
In that sense, AIKosh is not merely functioning as a technology repository. It is gradually becoming part of the legal and strategic infrastructure that could shape the future of AI innovation, competition, and intellectual property in India.
Sectoral Spotlight: Healthcare and Agriculture
AIKosh’s 20-sector taxonomy includes two areas where AI patent activity in India is most intense and where AIKosh-hosted resources are actively being deployed.
In healthcare, AI-powered diagnostic tools, drug discovery models, and medical imaging classifiers represent significant patentable innovation opportunities. IIT Madras researchers have separately developed AI-based mathematical models for identifying cancer-causing cellular alterations, a class of innovation now more readily commercially viable given the 2025 CRI Guidelines.
In agriculture, crop yield optimisation models and pest detection systems built on localised datasets represent a frontier of India-specific AI innovation with limited prior art in global databases, which makes for a favourable environment for patent filing.
What Organisations Should Do Now
AIKosh is not just a data platform. It is a policy signal and an infrastructure investment that will directly shape the competitive landscape for AI IP in India over the next decade. For organisations active in the Indian AI innovation space, the immediate priorities are:
- Audit your models against AIKosh’s repository. Before filing any AI-related patent application, conduct a prior art search against AIKosh-hosted models and datasets. The platform is searchable, and the models it hosts represent an increasingly material body of prior art.
- Sequence contributions carefully. If your organisation is considering contributing datasets or models to AIKosh (for visibility, collaboration, or regulatory goodwill), file provisional patent applications before any public disclosure.
- Leverage the 2025 CRI Guidelines. The new technical-effect standard is a genuine opportunity. AI innovations that previously faced Section 3(k) rejections may now be prosecutable, but applications must be carefully drafted to demonstrate concrete technical advancement, not merely algorithmic novelty.
- Participate in AIKosh’s ecosystem actively. The platform hosts innovation challenges (IndiaAI Innovation Challenge 2026), offers AI sandbox access, and provides API-based integration with government datasets. Early engagement positions organisations to shape the data and model assets that become foundational to the next generation of AI IP in India.
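The sequencing priority above lends itself to a simple internal check before any asset is released. The sketch below is an illustration only: `needs_filing_first` and its parameters are hypothetical names, and treating a same-day contribution as risky is a conservative assumption rather than a statement of Indian patent law.

```python
from datetime import date
from typing import Optional


def needs_filing_first(planned_contribution: date,
                       provisional_filing: Optional[date]) -> bool:
    """Return True if the planned contribution should be held back.

    Assumption: a contribution is treated as a public disclosure, and a
    same-day contribution is conservatively flagged as risky.
    """
    if provisional_filing is None:
        # Nothing has been filed yet, so any contribution discloses first.
        return True
    return planned_contribution <= provisional_filing


# Example: a model scheduled for upload before the provisional is filed.
print(needs_filing_first(date(2026, 4, 1), None))
print(needs_filing_first(date(2026, 4, 1), date(2026, 3, 15)))
```

A gate like this is only as good as the dates fed into it, so it works best when the filing date comes from the patent docketing record rather than from a planned schedule.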
Conclusion
AIKosh represents a decisive step in India’s ambition to build sovereign, inclusive, and innovation-generating AI infrastructure. It is becoming the reference platform for AI development in India.
For the IP community, its significance goes beyond technology policy. It is reshaping the prior art landscape, introducing new open-access model repositories that will influence patent prosecution, and creating novel questions of ownership and attribution that Indian law has not yet fully answered.
The organisations that will lead the next wave of AI innovation and protect it effectively are those that understand AIKosh not merely as a government scheme, but as a structural feature of the Indian IP landscape that must be incorporated into every AI patent strategy from this point forward.





