I've produced two full courses with HeyGen. Not demos. Not test clips. Full production — an AI Leadership course for school leaders (8 modules, 39 segments, interactive branching, quizzes) and a professional accounting course conversion for a client in Southeast Asia, turning hundreds of hours of recorded lectures into avatar-delivered courseware.
That means I've hit every wall, worked around every limitation, and have strong opinions about what actually matters when you're building real content with this platform.
Here's what I wish someone had told me before I started.
Which Plan You Actually Need
HeyGen has four tiers. The pricing page makes them look straightforward. They're not.
| Plan | Price | Video Length | Resolution | Custom Avatars | Key Features |
|------|-------|--------------|------------|----------------|--------------|
| Free | $0 | 1-3 min | 720p | 1 video avatar | 3 videos/month, watermark |
| Creator | $29/mo | 30 min | 1080p | 1 avatar + voice | 200 credits/month |
| Business | $149/mo + $20/seat | 60 min | 4K | 5 avatars | SCORM, branching, quizzes |
| Enterprise | Custom | Custom | Custom | Custom | Everything + SSO, API |
Annual billing drops Creator to $24/month and Business to $126/month. But here's the part that matters more than any of that.
The credit trap. HeyGen's Avatar IV — the good one, the one that looks realistic — costs 20 credits per minute of video. Creator gives you 200 credits per month. That's 10 minutes of Avatar IV video per month. Not 30 minutes. Ten.
Credits don't roll over. If you don't use them by month-end, they're gone. You can buy 300 additional credits for $15/month, but that still only gets you 25 minutes total.
If you're producing a single short video for social media, Creator works. If you're building a course or a training library, you'll burn through credits in two days and spend the rest of the month waiting.
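To make the budgeting concrete, here's a back-of-the-envelope sketch using the rates quoted above (verify them against your own plan, since HeyGen changes these):

```python
# Back-of-the-envelope Avatar IV credit math, using the rates quoted above:
# 20 credits per minute, 200 credits per month on Creator.

AVATAR_IV_CREDITS_PER_MIN = 20

def minutes_of_video(credits: int) -> float:
    """Minutes of Avatar IV video a credit balance buys."""
    return credits / AVATAR_IV_CREDITS_PER_MIN

print(minutes_of_video(200))        # Creator plan alone: 10.0 minutes
print(minutes_of_video(200 + 300))  # with the $15 credit add-on: 25.0 minutes
```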
My recommendation: Start on Creator to test your avatar and workflow. Move to Business the moment you need SCORM export, interactive branching, or more than 10 minutes of production per month. The jump from $29 to $149 is steep. But if you need the features, there's no middle ground.
Walkthrough 1: Creating Your Avatar
This is the part most guides gloss over. Here's every click.
Go to heygen.com and sign in. If you don't have an account, the free tier lets you test the platform before committing.
In the left sidebar, click Avatars. You'll see a library of stock avatars at the top and your custom avatars below (empty if you're new).
Click Create New Avatar in the upper right. You'll get several options. Select Start with Video to create a Digital Twin — this is the only option that produces a realistic version of your actual face. Photo avatars and AI-generated avatars exist, but for anything client-facing or course-grade, Digital Twin is the only one worth using.
Recording Your Source Video
HeyGen needs 2-5 minutes of video footage of you speaking naturally. This is the raw material the AI uses to learn your face, expressions, and mouth movements. Getting this right matters more than anything else in the process.
Camera: Your phone is fine. Seriously. A modern iPhone or Android phone shooting 1080p at 30fps produces better footage than most webcams. Prop it up at eye level — not on your desk angled up at your chin. If you have a tripod or a phone mount, use it. Stability matters.
Framing: Medium shot, chest up. Your head should fill roughly the top third of the frame. Too close and the AI amplifies every twitch. Too far and it loses detail on your mouth and eyes.
Eye line: Look directly at the camera lens. Not the screen. Not slightly above it. The lens. This is the single biggest factor in whether your avatar looks natural or slightly off. If you're using a laptop, stick a small dot next to the camera to remind yourself where to look.
Lighting: Face a window if you can. Natural light from the front is better than any ring light from the wrong angle. Avoid overhead fluorescent lighting — it creates shadows under your eyes that the AI reproduces faithfully. You want even, soft light on your face.
Clothing: Solid colors. No stripes, no busy patterns, no large logos. A plain shirt or blouse in a medium tone works best. White can blow out under bright lights. Black can merge with dark backgrounds.
Performance: Start with 2-3 seconds of silence while looking at the camera. Speak at a natural pace — don't rush, don't perform. Keep your lips fully closed during pauses between sentences. The AI uses those moments to learn your neutral resting expression. Keep hand movements below chest level and subtle. Dramatic gestures don't translate well.
The first 15 seconds matter most. Minimize blinking and stay as still as you can while the system calibrates your face. After that, relax into a natural delivery.
Uploading and Consent
Back in HeyGen, you'll see a screen that says Upload your video. Drag and drop your recorded file or click to browse. HeyGen accepts MP4, MOV, and WebM. Keep it under 500MB.
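If you're batch-producing, a quick pre-flight check saves a failed upload. A minimal sketch in Python, assuming the limits above; the file path is just an example:

```python
# Pre-flight check on avatar source footage before upload: HeyGen accepts
# MP4, MOV, and WebM under 500 MB. The file path is just an example.
from pathlib import Path

ALLOWED = {".mp4", ".mov", ".webm"}
MAX_MB = 500

def check_source(path: str) -> list[str]:
    p = Path(path)
    problems = []
    if p.suffix.lower() not in ALLOWED:
        problems.append(f"unsupported format: {p.suffix or '(none)'}")
    size_mb = p.stat().st_size / (1024 * 1024)
    if size_mb > MAX_MB:
        problems.append(f"{size_mb:.0f} MB is over the {MAX_MB} MB limit")
    return problems

print(check_source("avatar_source.mov") or "ready to upload")
```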
After uploading, HeyGen asks you to record a consent video. This is a legal requirement — they won't process your avatar without it. The screen will show a short paragraph for you to read aloud and a randomly generated 4-letter code. Record yourself reading the paragraph and saying the code clearly, looking at the camera. This proves you're a real person consenting to have your likeness used. It takes about 30 seconds.
Submit both videos. Processing takes anywhere from a few hours to a full day. HeyGen will email you when your avatar is ready.
When it's done, you'll find your new Digital Twin under Avatars in the left sidebar, in the "My Avatars" section. Click on it to preview how it looks with different scripts.
One warning: you can't patch an avatar after the fact. Any problem in the source footage means re-recording the entire thing. Get the source video right the first time.
Walkthrough 2: Creating Your First Video
Your avatar is ready. Here's how to turn it into an actual video.
Click Create in the top navigation bar. Select AI Studio. This opens the video editor.
You'll see three main areas. The script panel is on the left side — this is where you type what the avatar says. The canvas is the large area in the center — this is your visual preview showing the avatar, background, and any overlays. The timeline runs along the bottom — it shows your scenes in sequence, like slides in a presentation.
Setting Up Your Avatar
On the right side of the canvas, you'll see an avatar icon (it looks like a person). Click it to open the avatar selector. Find your Digital Twin in the list and click to place it on the canvas. You can drag it to reposition and resize it by pulling the corner handles.
Writing Your Script
Click in the script panel on the left. Type or paste your narration. There's a 2,000-character limit per scene — roughly 300-350 words. If your script is longer, you'll split it across multiple scenes.
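If your scripts start life as long lecture transcripts, as mine did for the accounting conversion, it's worth splitting them programmatically rather than by eye. A rough sketch that chunks a script at sentence boundaries under the 2,000-character cap; the filename is hypothetical:

```python
# Split a long narration script into scene-sized chunks under HeyGen's
# 2,000-character-per-scene limit, breaking at sentence boundaries so no
# sentence is cut mid-thought. A planning aid, not an official tool.
import re

MAX_CHARS = 2000

def split_script(script: str, max_chars: int = MAX_CHARS) -> list[str]:
    sentences = re.split(r"(?<=[.!?])\s+", script.strip())
    scenes, current = [], ""
    for sentence in sentences:
        candidate = (current + " " + sentence).strip()
        if len(candidate) > max_chars and current:
            scenes.append(current)
            current = sentence
        else:
            current = candidate
    if current:
        scenes.append(current)
    return scenes

chunks = split_script(open("module1_narration.txt").read())
print(f"{len(chunks)} scenes needed")
```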
Write the way you'd speak. Short sentences. Contractions always ("don't" not "do not", "we're" not "we are"). One idea per sentence. The avatar can't adjust emphasis or recover from awkward phrasing the way a human speaker can. If a sentence feels clunky when you read it aloud, it'll feel worse coming from an avatar.
Selecting Your Voice
Below the script panel, you'll see a voice dropdown. Click it to choose your voice. If you've created a voice clone through ElevenLabs or HeyGen's built-in voice cloning, it'll appear in your voice library. Otherwise, pick from the stock voices.
Adding Pauses
This is important and easy to miss. In the script panel, you can click the "+" icon next to the text to insert a pause. Pauses come in 0.5-second increments. I use a 1-second pause after every key point — it gives the viewer time to absorb what was just said. Without pauses, avatar narration feels like a machine reading a wall of text. With them, it feels like a person thinking.
Setting Your Background
Click the background icon on the right panel (it looks like a landscape image). You have three options: choose from HeyGen's stock backgrounds, pick a solid color, or upload your own image. For course production, you'll almost always upload your own — a branded slide, a photograph, or a plain backdrop.
Click Upload, select your image file (PNG or JPG, 1920x1080 recommended), and it becomes the scene background. Your avatar appears on top of it.
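If you're preparing backgrounds in bulk, check dimensions before you upload. A minimal sketch using Pillow; the filenames are examples:

```python
# Check a background image against the recommended 1920x1080 before upload.
# Uses Pillow (pip install pillow); the filenames are examples.
# Note: a plain resize distorts images whose aspect ratio isn't 16:9;
# crop first if that matters for your design.
from PIL import Image

TARGET = (1920, 1080)

img = Image.open("module1_background.png")
if img.size != TARGET:
    print(f"{img.size[0]}x{img.size[1]}, resizing to 1920x1080")
    img.resize(TARGET).save("module1_background_1080.png")
else:
    print("already 1920x1080")
```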
Adding More Scenes
At the bottom of the timeline, click "+ Scene" to add a new scene. Each scene is one "shot" — one background, one avatar position, one script block. A 5-minute video typically has 12-20 scenes.
Each scene gets its own script, background, and avatar positioning. You can copy a scene's settings to the next one if you want consistency, or change everything between scenes for visual variety.
Adding Overlays
Click the text icon or image icon in the toolbar above the canvas to add overlays — titles, bullet points, logos, shapes. These layer on top of your background and avatar. Useful for showing key terms or data while the avatar speaks.
Preview and Render
Click Preview at the top to watch individual scenes. Check the timing, make sure the script sounds natural, and verify that your overlays don't collide with the avatar.
When everything looks right, click Submit in the upper right to send the video for rendering. This is not instant. Rendering takes 5-15 minutes depending on video length. HeyGen will notify you when it's done.
Important: You cannot edit a video after submitting it for rendering. If something is wrong, you render the whole thing again. Preview carefully.
Importing Slides (The Course Production Workflow)
If you're building training content, you probably have slides already. This workflow is how I produced full courses — dozens of slides becoming dozens of scenes with avatar narration.
Click Create in the top navigation, then AI Studio to open a new project.
Before adding an avatar, look for the Import or Upload button near the top of the editor. Click it and select your PDF or PPTX file. HeyGen will process the file and convert each slide into a separate scene automatically. You'll see them appear in the timeline at the bottom.
Important: When the import dialog asks how to handle your file, choose "Image Background" — not "Video Background." This preserves your slide design exactly as you made it, with all your fonts, colors, and layouts intact.
Now click into each scene and add your avatar (right panel, avatar icon) and your script (left panel). Position the avatar where it won't cover critical content on the slide. I put mine in the lower-right corner for most slides.
This is how I produced a client's accounting course — 44 slides became 44 scenes, with the avatar delivering narration over each one. Design your slides with the avatar position in mind. Leave the lower-right quadrant clear of important content. I learned this after my avatar's head covered a key formula on three slides and I had to re-render the entire batch.
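One way to manage this at scale is to draft each scene's narration in the slide's speaker notes, then pull them all out in one pass for pasting into the script panel. A sketch using python-pptx; this is my workflow aid, not a HeyGen feature, and the filename is an example:

```python
# Pull speaker notes out of a deck so each slide's notes become that
# scene's script. Uses python-pptx (pip install python-pptx).
from pptx import Presentation

prs = Presentation("accounting_module1.pptx")
for i, slide in enumerate(prs.slides, start=1):
    notes = ""
    if slide.has_notes_slide:
        notes = slide.notes_slide.notes_text_frame.text.strip()
    if not notes:
        print(f"Scene {i:02d}: NO SCRIPT YET")
    elif len(notes) > 2000:
        print(f"Scene {i:02d}: {len(notes)} chars, over the per-scene limit")
    else:
        print(f"Scene {i:02d}: {len(notes)} chars, ok")
```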
Interactive Features and SCORM Export
This is Business plan territory. If you're publishing to an LMS — LearnWorlds, Moodle, Canvas, anything SCORM-compatible — this is where the real value is.
What Business unlocks:
- Branching: Viewer clicks a button, video jumps to a different scene. I used this for scenario-based decision training — "A parent complains about AI in the classroom. What do you do?" Each choice leads to a different avatar response explaining the consequences.
- In-video quizzes: Multiple choice questions that pause the video. Customizable feedback per answer.
- Action buttons: Link to external URLs or jump to specific scenes.
- SCORM export: Packages your video as a SCORM 1.2 or 2004 file. Set a completion threshold (e.g., "80% watched = complete"). Upload to any LMS.
I tested SCORM export with SCORM Cloud and deployed to LearnWorlds. Completion tracking works. Quiz scores pass through. It's not flawless — you should always test in your target LMS before publishing to students — but it works.
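Before that LMS test, a basic sanity check on the export itself helps: a SCORM package is a zip with an imsmanifest.xml at its root. A minimal sketch; the filename is an example:

```python
# Sanity-check a SCORM export before uploading: the package must be a zip
# with imsmanifest.xml at its root, or the LMS import will fail.
import zipfile

def check_scorm(path: str) -> None:
    with zipfile.ZipFile(path) as z:
        if "imsmanifest.xml" in z.namelist():
            print("imsmanifest.xml found at root: looks like a valid package")
        else:
            print("no imsmanifest.xml at root: LMS import will likely fail")

check_scorm("ai_leadership_module1.zip")
```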
The branching design constraint: Plan your branching on paper first. Each branch is a separate scene in the timeline. With 3 choices at 5 decision points, you're looking at dozens of scenes. Without a flowchart, you'll lose track fast. I use a simple spreadsheet: scene number, script, branches-to.
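The same spreadsheet idea works as a small script that catches dangling or orphaned scenes before you build anything in the editor. A sketch with hypothetical scene data:

```python
# Validate a branching plan before building it in the editor.
# Structure mirrors my planning spreadsheet: scene id -> (script, branches-to).
# Scene contents here are hypothetical.

plan = {
    1: ("A parent complains about AI in the classroom. What do you do?", [2, 3, 4]),
    2: ("Choice A: dismiss the concern. Avatar explains the fallout.", [5]),
    3: ("Choice B: schedule a meeting. Avatar explains the outcome.", [5]),
    4: ("Choice C: escalate to the district. Avatar explains the outcome.", [5]),
    5: ("Debrief scene: what each choice signals to the parent.", []),
}

START = 1
targeted = {t for _, branches in plan.values() for t in branches}

# Dangling branches: a choice that points at a scene that doesn't exist.
for scene_id, (_, branches) in plan.items():
    for t in branches:
        if t not in plan:
            print(f"Scene {scene_id} branches to missing scene {t}")

# Orphans: scenes that no branch ever points to (other than the start scene).
for scene_id in plan:
    if scene_id != START and scene_id not in targeted:
        print(f"Scene {scene_id} is never branched to")
```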
What I Wish I Knew Before Starting
After producing two full courses, these are the things that would have saved me the most time.
Previews lie. The in-editor preview doesn't always match the rendered output. Lip sync, timing, and overlay positioning can shift slightly. Always render a test scene before committing to a full video.
Lip sync quality varies. With stock HeyGen voices, lip sync is good. With ElevenLabs voice clones, it's inconsistent. Some words track perfectly. Others drift. You won't know until you render. Budget time for re-renders.
Always export at 1080p. I've had rendering bugs at lower resolutions. 1080p has been reliable.
HeyGen changes things. Plan inclusions have shifted without much warning — translation minutes got reduced mid-cycle on one of my billing periods. Check what your plan includes before you start a production run, not after.
Custom motion prompts are limited. You can prompt the avatar to gesture or move, but it only works well for clips under 10 seconds. For longer scenes, stick with the default natural movement.
The credit math one more time. If you're producing a 5-minute video with Avatar IV, that's 100 credits. Creator's 200 credits get you two of those per month. Plan accordingly, or plan to buy extra credits.
Get Started
HeyGen is the most capable AI avatar platform I've used for course production. It's not perfect — the credit system is aggressive, the Business plan price jump is real, and you'll spend more time on re-renders than you'd like. But the output quality is genuinely good enough for professional deployment.
If you want to try it: [start here][AFFILIATE_LINK]. The free tier gives you three videos to test whether the quality meets your standards before you spend anything.
If you want to see what a full HeyGen-produced course looks like — with interactive branching, SCORM export, and real instructional design behind it — I'm publishing the complete AI Leadership course at academy.kaiak.io. Free access to the first module.
Build something real with it. That's where the learning happens.
