In-Depth

Inside HeyGen: The Rise of an AI Avatar Empire and the Founders Rebuilding Video Creation

The subjects of this research are HeyGen itself and its two publicly confirmable cofounders, Joshua Xu and Wayne Liang. HeyGen’s official author pages identify Joshua Xu as CEO & Co-Founder and Wayne Liang as Chief Innovation Officer & Co-Founder; its privacy, terms, and biometric policy pages use the legal name HeyGen Technology Inc. HeyGen’s 2026 About page states cumulative platform totals of 131,896,460 videos generated, 106,242,587 avatars generated, and 18,134,866 translated videos. The company also says it has helped 100,000+ companies and millions of users create video. HeyGen’s LinkedIn company page cites a community of 30M+ users and adoption by 85% of the Fortune 100, while HeyGen’s own 2026 Fast Company announcement gives the more specific figure of 31 million signups. These figures were captured at different times and are not strictly comparable, but the larger conclusion is clear: HeyGen has moved beyond being a small AI avatar tool and become a scaled AI video platform with meaningful enterprise penetration.

In compressed form, HeyGen is fundamentally a company built from the combination of Snap-era camera / recommendation engineering and Smule-style creator-product design. Its founders did not enter through the logic of traditional film production. They entered through the problem of how to let people who do not want to be on camera still communicate at scale, cheaply, and convincingly. Joshua Xu repeatedly says in public that “the camera is replaceable,” and HeyGen’s official “Why we build HeyGen” essay makes the underlying thesis even clearer: both founders describe themselves as introverts, and they started the company not because they loved being on camera, but because they did not. That product philosophy later shaped almost every major product direction at the company: digital twins, lip-synced translation, batch generation, enterprise training, sales videos, real-time avatars, and APIs.

The company’s trajectory is unusually legible. Joshua Xu wrote in an official retrospective that the company began in December 2020. The same official growth retrospective says the SaaS product launched on July 29, 2022, reached $1M ARR in 178 days, and became “ramen-profitable” in 217 days, with profitability achieved by April 2023. HeyGen’s official 2024 Series A announcement later said the company went from $1M ARR to $35M+ ARR in just over a year and had already turned profitable by Q2 2023. Bloomberg reported in June 2024 that HeyGen raised $60M at a $500M valuation, bringing total funding to $74M. By November 2025, a Forbes search snippet stated that the company had reached $100M in recurring revenue. That makes HeyGen more than a popular AI demo: it has crossed the core SaaS thresholds of paying demand, profitability, institutional financing, enterprise adoption, and material revenue scale.
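To make the day counts above easier to place on a calendar, here is a minimal Python sketch that converts them into dates. It assumes the 178-day and 217-day figures are measured from the July 29, 2022 launch date, with the launch counted as day zero; the retrospective as summarized here does not spell out the counting convention, and the names `LAUNCH`, `MILESTONE_DAYS`, and `milestone_dates` are illustrative only.

```python
from datetime import date, timedelta

# Launch date and day counts as cited in the official growth retrospective.
LAUNCH = date(2022, 7, 29)
MILESTONE_DAYS = {"$1M ARR": 178, "ramen-profitable": 217}


def milestone_dates(launch: date, offsets: dict[str, int]) -> dict[str, date]:
    """Map each milestone to launch + N days (assumption: launch is day 0)."""
    return {name: launch + timedelta(days=days) for name, days in offsets.items()}


for name, when in milestone_dates(LAUNCH, MILESTONE_DAYS).items():
    print(f"{name}: {when.isoformat()}")
```

Under that assumption, the $1M ARR milestone falls in late January 2023 and the ramen-profitable milestone in early March 2023, which sits comfortably before the Q2 2023 profitability cited in the Series A announcement.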

In real-world positioning terms, I would define HeyGen as a first-tier application-layer AI video company, but not yet the uncontested category king. On the positive side, it has strong external validation signals, including G2’s #1 Fastest Growing Product ranking for 2025, Fast Company’s 2026 Most Innovative Companies recognition, and inclusion in the Forbes AI 50 for 2026. On the other hand, its public financing valuation in 2024 was roughly $500M, while competitor Synthesia’s public valuation in early 2025 was already $2.1B. In other words, HeyGen is best understood as a very fast-growing, design-strong, enterprise-credible leader in the category’s first tier, but not as the sole dominant player.

Founder Profiles and Development Paths
Publicly verifiable information on Joshua Xu is concentrated in education and career history, not in family background. His birth date, birthplace, parents’ occupations, class background, and early childhood resources cannot currently be confirmed from public sources. What can be confirmed is that both LinkedIn and HeyGen’s official author page point to Carnegie Mellon University, and his LinkedIn education section lists an MS from the Robotics Institute, School of Computer Science, Carnegie Mellon University. The same LinkedIn page also lists publications from 2012 and 2013, indicating serious technical training before his industry career.

Joshua’s first major career chapter was at Snap. In public interviews, he says he joined Snap in 2014 and spent roughly 6.5 years there. He began on Snapchat’s ads systems, working on machine learning, ranking, and recommendation, and spent his final two years on AI camera technology. HeyGen’s official author page and external speaker/event bios describe this period in similar terms: he was a lead engineer or engineering leader driving ads ranking, machine learning, computational photography, and AI camera technology. This matters because it explains why he did not go on to build a conventional editing tool. He started from the question of whether the camera itself could be replaced by generative models.

Joshua’s intellectual turning point appears to have come from his 2018-era work on Snap’s AI camera stack. In both the Unite.ai interview and the No Priors conversation, he explains that building AI-enhanced camera features and filters led him to realize that a computer could create high-quality video effects and ultimately generate content that did not exist in the physical world. That made him believe AI would fundamentally change content creation. His core bet was not merely that AI could improve video production efficiency, but that AI could become the new camera. That is the deepest philosophical starting point behind HeyGen.

Wayne Liang has a similar public-information pattern: more career detail than family detail. On birth date, birthplace, parents, family class background, and childhood, the careful summary is again: public information is limited / cannot currently be confirmed. What can be confirmed is that his LinkedIn profile points to Carnegie Mellon University, and HeyGen’s official author page explicitly says he worked at Smule before HeyGen. The official bio says that role exposed him to the fact that creative expression is often constrained not by talent, but by the friction of presentation itself. His current official role is to shape human-centric AI video experiences.

Wayne’s earlier career is less fully documented than Joshua’s in strong public English sources. Among authoritative public sources, HeyGen’s official materials clearly confirm Smule; a Forbes search summary also describes him as having worked in product design at Smule and ByteDance. Because HeyGen’s own current author page does not fully spell out the detailed role sequence, the most cautious formulation is this: Smule is clearly confirmed, while a fuller early-career timeline remains publicly limited. Even so, Wayne’s functional role in the company is very clear. He is not a passive capital-side cofounder; he is a core product and experience builder. The observational texture in HeyGen’s official “Why we build HeyGen” essay—creators doing endless retakes and still refusing to publish—has a distinctly product-design lens.

The pairing of the two founders is best understood as an “engineering + product + creator-psychology” combination. In HeyGen’s 2023 Movio rebrand post, Joshua wrote that he and Wayne crossed paths at school, and that after graduation both had spent nearly a decade in the video content industry. The same post reveals one of the company’s most important analytical choices: after decomposing the video-production workflow, they concluded that editing was not the expensive bottleneck; the camera stage was. That is why HeyGen first replaced the human-on-camera layer rather than first optimizing post-production. The sequence of products that followed (avatars, talking photo, translation, then AI Studio, Video Agent, and LiveAvatar) reflects that original diagnosis.

Joshua has also described the company’s startup method in unusual detail. In the official growth retrospective, he explains that before the polished SaaS launch, the team used Fiverr to sell on-demand multilingual spokesperson videos. At first they did not even explicitly disclose that avatars were AI-generated; they simply delivered similar outputs faster and cheaper. Their first paying customer spent just $5. This is important because it shows HeyGen did not begin with a flashy demo in search of a market. It began by testing whether people would actually pay for substitute video presence, then productized the service. Joshua frames this explicitly as validating AI-market-fit.

Company Evolution and Business Structure
HeyGen’s history is not a single-brand line. It is better understood as Surreal → Movio → HeyGen. SCMP reported in 2024 that the company was founded in Shenzhen in 2020, was initially known as Surreal, moved to Los Angeles in 2022 and used the Movio brand, and then rebranded to HeyGen around April 2023. In the official rebrand post, Joshua said the Movio product had already generated 2M+ interactive video examples within nine months of launch, the team had grown to 30 people, and the company had shipped 32 versions and 100+ features. The meaning of these rebrands is strategic: the company moved from a relatively narrow spokesperson-video tool toward a broader AI video generation platform.

The period from 2022 to 2024 was when HeyGen found product-market fit and converted that into capital and scale. Its official retrospective confirms a July 29, 2022 launch, $1M ARR in 178 days, and “ramen profitability” in 217 days. By the time of the official 2024 Series A announcement, the company said it had jumped from $1M ARR to $35M+ ARR and had already become profitable in Q2 2023. Bloomberg’s June 2024 report added the external financing layer: $60M raised, $500M valuation, and $74M total funding to date. The official Series A post also stated that HeyGen was then serving 40,000+ paying business customers worldwide. Structurally, this is a compressed SaaS growth pattern: paid demand first, profitability early, then large financing—not years of pure burn before commercial proof.

From 2025 into 2026, HeyGen clearly started repositioning itself from an avatar-video tool into a fuller video infrastructure layer. Its homepage and product updates show that the company is no longer only about AI avatars / digital twins / talking photo / video translation / localization / voice cloning. It now includes AI Studio, Video Agent, Interactive Video, SCORM export, LMS integrations, LiveAvatar, and API / MCP-based developer access. The March 2026 official release is especially revealing: Brand Systems, Interactive Video, 4K enhancement, pay-as-you-go API, distribution via fal / Replicate / Runware, and MCP availability on Claude, Manus, and OpenAI. That is not the shape of a single consumer web tool anymore. It is the shape of a platform trying to become a video capability layer inside enterprise and agentic workflows.

If we separate HeyGen’s brands, assets, organizations, and platforms into categories, two stand out. The first category is true asset-like infrastructure: the heygen.com core platform, the LiveAvatar real-time product and domain, the API business, enterprise workflow integrations, digital twin generation, translation and lip-sync systems, customer subscriptions, and developer access rails. The second category is better described as influence assets: Customer Stories, the community and help center, webinars, integrations such as Canva, and external trust markers like Fast Company, G2, and Forbes recognition. The first category directly drives revenue and defensibility. The second drives trust, distribution, and acquisition efficiency. In other words, HeyGen is not a media company or a foundation-like organization. It is a software platform company with a strong narrative layer wrapped around the product.

Its business model is already relatively complete. The official pricing page lists Free, Creator at $29/month, Pro at $49/month, Business at $149/month, with additional seats priced at $20 per seat per month, and Enterprise sold through custom contracts. What Business and Enterprise add is not just more templates; they add SSO, centralized billing, team collaboration, draft commenting, Interactive Video, SCORM export, LMS integrations, brand systems, access controls, and enterprise privacy/security. On top of that, the API side moved in 2026 to pay-as-you-go, starting at $5, with no monthly commitment required. That means HeyGen effectively monetizes through three stacked layers: self-serve subscriptions, team/enterprise seats, and API usage, plus add-ons such as premium credits and extra digital twins. That is increasingly the revenue architecture of a mature SaaS platform rather than a one-off creative tool.
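As a rough illustration of how the seat-based layer adds up, the sketch below computes a Business-plan bill from the list prices quoted above. It is an assumption-laden example, not HeyGen’s billing logic: it assumes monthly billing with no annual discount, taxes, or add-on credits, and it takes the number of additional seats as an input because the number of seats included in the base plan is not stated here. The function names are hypothetical.

```python
# List prices as cited from the official pricing page (USD per month).
BUSINESS_BASE_USD = 149
EXTRA_SEAT_USD = 20


def business_monthly_cost(extra_seats: int) -> int:
    """Monthly Business-plan cost for a given number of additional seats."""
    if extra_seats < 0:
        raise ValueError("extra_seats must be non-negative")
    return BUSINESS_BASE_USD + EXTRA_SEAT_USD * extra_seats


def business_annual_cost(extra_seats: int) -> int:
    """Assumption: twelve monthly payments, no annual discount applied."""
    return 12 * business_monthly_cost(extra_seats)


print(business_monthly_cost(5))  # 149 + 5 * 20
print(business_annual_cost(5))
```

On these assumptions, a team with five additional seats pays $249 per month, or $2,988 over twelve months, before any API usage or add-ons.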

The customer stories make clear that HeyGen’s strongest achievement is not that “AI is trendy,” but that it demonstrably saves time, cuts cost, expands into new languages, and scales output. Official case studies say Würth cut translation costs by 80% and halved production time; Tomorrow.io saved 2–3 months per year of video production time and reduced delivery from one week to two days; The Economist used the platform to scale multilingual journalism while trying not to sacrifice editorial integrity; educator Anton Voroniuk reached 1M+ students and reduced video content cost to 1/40th of the traditional level. The official Series A announcement adds another set of signal-heavy use cases: McDonald’s, Salesforce, Argentine President Javier Milei’s WEF speech, Wisetech Global, the Mayor of Yokosuka, and others. What people remember about HeyGen is not merely the avatar effect. It is the fact that the company is turning video from a heavy production category into a lightweight operating capability.

The current quantitative picture reinforces that interpretation. HeyGen’s official About page lists more than 131.8M videos, 106.2M avatars, and 18.1M translated videos, while its customer-logo area includes names such as HubSpot, Workday, HP, Trivago, J.P. Morgan, Autodesk, Miro, Intel, DHL, Bosch, Komatsu, Coursera, and Spring Health. Its LinkedIn company page places it in the 51–200 employee size band. So the company is still organizationally lean relative to its output, but it is using software leverage and model leverage to support a content-production footprint much larger than its headcount would normally suggest.
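The leverage claim can be given a rough order of magnitude. The sketch below divides the cumulative video total from the About page by the two ends of the LinkedIn headcount band. This is only a back-of-the-envelope bound: it compares an all-time cumulative total against a current headcount range, and the function name is illustrative.

```python
# Figures as cited above: cumulative videos (About page), headcount band (LinkedIn).
TOTAL_VIDEOS = 131_896_460
HEADCOUNT_BAND = (51, 200)


def videos_per_employee(total: int, band: tuple[int, int]) -> tuple[float, float]:
    """Return (low, high) cumulative videos per employee across the band."""
    smallest_team, largest_team = band
    # The largest team gives the lowest ratio; the smallest team gives the highest.
    return total / largest_team, total / smallest_team


low, high = videos_per_employee(TOTAL_VIDEOS, HEADCOUNT_BAND)
print(f"{low:,.0f} to {high:,.0f} cumulative videos per employee")
```

Even at the top of the band, that works out to roughly 0.66 million cumulative videos per employee, and about 2.6 million at the bottom.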

Capital Network, Controversies, and Current Position
HeyGen’s capital structure falls into two stages. Early on, it had a visibly China-linked investor base. The Financial Times and SCMP both reported that early Chinese investors included IDG Capital, Baidu Ventures, HongShan, and ZhenFund. By late 2023 and especially 2024, the company was clearly rotating toward a U.S.-led cap table. Public reporting and databases indicate that HeyGen raised $5.6M in 2023 from Conviction; Bloomberg reported that the $60M Series A in June 2024 was led by Benchmark, with participation from Conviction, Thrive Capital, and Bond Capital, and that Benchmark partner Victor Lazarte joined the board. SCMP also named additional new and returning supporters including Dylan Field, Elad Gil, Aviv Nevo, Neil Mehta, and SV Angel. This capital shift mattered not only because of money but because it increased the company’s compliance runway, enterprise acceptability, and access to mainstream U.S. financing networks.

Behind that financing shift was a more structural decision: reducing Chinese investor and operating-entity exposure. FT reported that HeyGen asked its Chinese backers to sell shares to U.S. counterparts as scrutiny of China-linked ownership intensified in the American market. SCMP went further, writing that HeyGen had dissolved its mainland Chinese operation ahead of the Series A and encouraged Chinese investors to exit in favor of U.S. investors. This decision was strategically important because it shaped whether HeyGen could be accepted as a mainstream enterprise supplier in the U.S., whether it could attract top-tier American VC support, and whether it would be seen as a compliance-safe AI application company rather than a geopolitically sensitive one. In practical business terms, this was one of the company’s most consequential scaling decisions.

If we isolate the most important decisions made by Joshua Xu and Wayne Liang, four stand out. First, replacing the camera before optimizing editing. Second, validating willingness to pay through Fiverr before building a polished SaaS product. Third, moving from Shenzhen / China-linked ownership toward Los Angeles and a U.S.-oriented cap table. Fourth, upgrading from an avatar tool into an enterprise workflow and API platform. Each of these solved a different bottleneck: real demand, payment validation, market/compliance identity, and long-term platform defensibility. That sequence is a large part of why HeyGen managed to move quickly through the demo stage, the paid stage, the compliance stage, and the enterprise stage.

The positive side of those choices is obvious. The negative side is that they placed HeyGen directly in the most sensitive controversy zone of generative video. The central controversy is not a classic financial scandal but the knot of deepfakes, consent, likeness rights, downstream misuse, and platform responsibility. The Financial Times reported that one of influencer Olga Loiek’s deepfakes was created using HeyGen tools, and that HeyGen technology is also accessed through other products via software plug-ins, making it hard to police every downstream use. The Washington Post separately documented cases in which ordinary women’s faces were stolen and turned into AI ads. The core criticism here is not simply whether HeyGen has rules. It is whether rules can actually be enforced at the far edge of the ecosystem once generation capability spreads through integrations and intermediaries.

To be clear, HeyGen has not taken a laissez-faire posture in public materials. Its official moderation policy requires explicit consent for custom avatars, prohibits creating avatars of real people without consent, gives represented individuals takedown rights, and forbids avatars of minors and of public figures without consent, as well as infringing or harmful imagery. Its security and trust pages highlight SOC 2 Type II, GDPR, CCPA, DPF, and EU AI Act compliance language, and say enterprise data is excluded from model training by default. Joshua also said in 2024 that the company uses live video consent, dynamic verbal passcodes, and human review as part of verification. The unresolved issue, however, is not whether safeguards exist. It is whether safeguards are sufficient against the broader externality of generative video misuse. So the main debate around HeyGen is not “no safety,” but the structural tension between increasingly powerful avatar generation and the limits of platform-side control.

A second, softer but real controversy cluster concerns pricing and plan communication. HeyGen’s help center says the old “Unlimited” plans were deprecated after May 15, 2026. At the same time, HeyGen’s own community forum shows some users raising strong complaints about unlimited-plan interpretation, translation limits, refunds, and plan changes. These materials should not be treated as court-verified findings or universal facts, but they do show that as HeyGen moved from early hypergrowth into more mature product operations, it began hitting a classic SaaS problem set: pricing redesign, entitlement reduction, and expectation management. That is not the same as a major scandal, but it can absolutely affect reputation and retention.

As of 2026, HeyGen’s real-world footprint is already quite concrete. Its official About page lists offices in Los Angeles, San Francisco, Palo Alto, and Toronto. Public company materials and LinkedIn together point to 100,000+ companies, 31M signups, 85% Fortune 100 penetration, and a two-track self-serve plus enterprise business. HeyGen’s official Fast Company announcement says users generated 101 million minutes of video in 2025, which was 4x the volume of all of 2024. A November 2025 Forbes snippet says recurring revenue had reached $100M. So HeyGen is no longer accurately described as “just a deepfake site.” It is better understood as an AI video infrastructure company that is combining avatars, localization, real-time presence, and enterprise workflow tooling into a new communications layer. In summary, HeyGen is one of the strongest first-tier application-layer companies in generative video, unusually strong in growth, product density, and branding, but its long-term ceiling will depend heavily on two things: enterprise trust and the ongoing governance of deepfake externalities.