Artificial intelligence firm Humyn Labs has committed $20 million to expand data collection operations across India, Southeast Asia, Latin America, and West Asia, the company said in a press release on Monday.
The capital will finance the infrastructure required to train physical artificial intelligence systems (robots) and voice models. Co-founders Manish Agarwal and Ishank Gupta channeled the investment to organize and validate human intelligence, the release said.
Agarwal told Mint that Humyn, still in its early stages and not backed by any institutional or , is using its revenue to invest $20 million in building high-quality datasets for physical AI, focusing on egocentric and conversational voice data. In artificial intelligence, egocentric data is data captured from a first-person view by a human or agent while interacting with its surroundings.
The firm focuses on source-first data collection, recording first-person human activity, visuals, and movements in commercial, agricultural, and residential environments. The datasets capture how humans navigate and physically interact with their surroundings to train physical AI systems, Agarwal explained.
The company is expanding its voice data infrastructure to encompass 33 languages, dialects, accents, and code-switching patterns. The expansion targets the use of voice for real-world commands and human-robot interactions.
Humyn Labs will establish robotics labs to construct simulation environments and world models. This unit integrates real-world data with training frameworks to deploy physical systems. In AI systems, a world model is an internal representation of how the real world works. In robotics labs, world models are typically used to let robots learn in simulation before actual deployment.
Humyn AI uses a decentralized network across the Global South to source data. The current pipeline is driven by roughly 15–18 customers. While Agarwal did not disclose any client names, he said the company targets top-tier labs where successful proof-of-concepts can quickly scale into $10–$15 million contracts.
The company’s growth is currently uneven. “While we have delivered approximately $2 million in revenue over the last few months, we are operating at an annualized run rate of around $4–5 million,” Agarwal said.
Agarwal said that Humyn has a sales pipeline of around $45–50 million. “Based on current execution, we expect to reach a $100 million Annual Recurring Revenue (ARR) by the end of December 2026.”
According to market research firm Fortune Business Insights, the global AI training dataset market was valued at $3.59 billion in 2025 and is projected to grow from $4.44 billion in 2026 to $23.18 billion by 2034, with the India market projected to be worth $190 million in 2026.
