Faster AI Video in ComfyUI: AccVideo for Wan 2.1

I’ve been testing the new AccVideo approach with Wan 2.1 inside ComfyUI, and the results are strong at very low step counts. I generated the examples in 8–10 steps, which already gives a very good trade-off between speed and quality.

According to the project page, speed improvements range from about 8.5× to 9.6× over the standard Wan Video workflow, with practical generation in 8–10 steps. Below, I’ll walk through the exact workflows, prompts, step counts, and configuration changes I used, plus measured times and quality notes.

What’s New and Why It Matters

Reported speed-up: 8.5× to 9.6× faster vs. a typical Wan Video pipeline
Practical step counts: 8–10 steps are often enough to get solid motion and detail
Works in ComfyUI: I tested with two workflows and recorded results at 6–15 steps

The goal here is simple: shorter runs, faster iteration, and acceptable quality without heavy tuning. You’ll see where the sweet spots form across steps and CFG values based on direct tests.

Get the Files and Model Setup

Download the AccVideo LoRA/model file from the project’s repository or release page.
Place the file in your ComfyUI models folder (a LoRA goes in the models/loras directory).
Load the Wan 2.1 base (and the AccVideo LoRA) in your workflow.
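
If you prefer to script the file placement, here is a minimal sketch in Python. It assumes a standard ComfyUI layout where LoRA files live under models/loras; the function names, the install path, and the file name in the usage line are placeholders I made up for illustration, not the project's actual release name.

```python
from pathlib import Path
import shutil


def lora_destination(comfy_root, lora_file):
    """Return the target path for a LoRA file inside a ComfyUI install.

    Assumes the standard layout: <comfy_root>/models/loras/<file name>.
    """
    return Path(comfy_root) / "models" / "loras" / Path(lora_file).name


def install_lora(lora_file, comfy_root):
    """Copy a downloaded LoRA file into the ComfyUI loras folder."""
    dest = lora_destination(comfy_root, lora_file)
    dest.parent.mkdir(parents=True, exist_ok=True)  # create the folder if missing
    shutil.copy2(lora_file, dest)
    return dest


if __name__ == "__main__":
    # Hypothetical paths; replace with your own download and install locations.
    print(lora_destination("ComfyUI", "downloads/accvideo_lora.safetensors"))
```

After copying, restart ComfyUI or refresh the node list so the new file shows up in the LoRA loader.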

I’ll share both ComfyUI workflows I used so you can reproduce these runs and tweak for your system.

Workflow Overview in ComfyUI

I used two separate workflows:

Workflow 1
- Base: Wan 2.1
- Sampler: UniPC
- Steps tested: 8, 10
- CFG tested: 3, then 4
- Prompt focus: a single subject in a street scene
Workflow 2
- Model reference: Dora ACC Video 14B
- Steps tested: 6, 7, 8, 9, 10, 13, 15
- CFG tested: 4, 6
- Extra setting noted: Shift = 7 (for a later test)
- Same prompt family, then a small prompt variation

Both workflows aim to keep settings minimal so the speed claim can be measured clearly. The exact step and CFG values are listed with each result below.

Workflow 1: Fast Tests at 8–10 Steps

Prompt Used

A realistic, beautiful woman with elegant features and long hair in the streets of Japan.

Sampler and Steps

Sampler: UniPC
Steps: 8 (first), then 10
CFG: 3 (first), then 4

Results and Notes

8 steps, CFG 3: The clip was decent for such a short run, but detail and stability could be better.
10 steps, CFG 3: Noticeable improvement. Motion and appearance look more coherent.
8 steps, CFG 4: A bit better than CFG 3 at the same step count, but still short of the 10-step quality.

Early takeaway: 10 steps offers a clear bump over 8 steps here. CFG=4 gives a small lift at low steps, but step count matters more than the slight CFG change in this range.

Workflow 2: Expanded Tests and Lower-Step Limits

This workflow references “Dora ACC Video 14B” and sticks with the same baseline approach. I first tested at 13 steps, then pushed lower to see where quality drops too far, and finally explored mid-range values.

Prompt Used (same as Workflow 1 initially)

A realistic, beautiful woman with elegant features and long hair in the streets of Japan.

Step Counts and Observations

13 steps: Quality jumped over the earlier 8–10 step runs; a good balance of motion and detail.
6 steps: Failed test. The clip was too blurry and unstable. This is below the workable threshold for this setup.
8 steps: Back to acceptable. Holds together well for quick drafts.
9 steps: Strong result with the same prompt. This run took about 292 seconds.

At 9 steps, the prompt and scene remained unchanged, and quality was “pretty good” within the speed goals. For fast iteration, this felt like a workable target.

Prompt Variation and Tuning

After confirming the lower bound around 8 steps (and seeing 6 steps fail), I switched the prompt slightly to see how CFG and step counts interact with a more specific subject.

New Prompt

A realistic, beautiful American woman with elegant features and short black hair in the streets of Japan.

Test 1: 7 Steps, CFG 6

Result: Not bad but not great; still under the quality line I’d want for a final clip.
Note: At this low step count, pushing CFG higher didn’t fix the underlying lack of detail.

Test 2: 10 Steps, CFG 4, Shift 7

Result: Clearly better. Visual issues reduced, subject looks more confident on screen.
Time: About 333 seconds for 10 steps.
Note: A parked car in the background showed issues; low steps can miss complex shapes or fine detail. Step count helps more than CFG in cases like this.

Test 3: 15 Steps, same prompt

Result: Looked good overall, but hand detail was odd and there was a small mark on the forehead. These occasional artifacts still show up.
Conclusion: Based on multiple runs, 10–12 steps tend to be the best compromise between time and quality in this workflow.

What the Tests Suggest

6 steps is too low in this setup. Expect blur and instability.
8 steps is workable for quick previews.
9–10 steps hits a strong balance of quality and speed.
13 steps looks better than 10, but the time cost grows.
15 steps can give a bit more polish, yet artifacts can still appear; it’s not a guaranteed fix.
CFG changes help, but step count had the largest impact across tests.

Practical Settings That Worked

Base: Wan 2.1
AccVideo LoRA: Loaded in the LoRA slot
Sampler: UniPC
Steps:
- Fast drafts: 8
- Balanced: 9–10
- Higher detail: 12–13
CFG:
- 3–4 for low to mid steps
- 6 didn’t help much at very low steps in my tests
Extra setting referenced: Shift = 7 (used in the 10-step test with the second prompt)

These are not strict rules. They are the parameters that produced the results described above, in the same order they were tested.
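
To keep these values in one place, the settings above can be written down as named presets. This is only a sketch: the RunSettings class, the preset names, and get_preset are my own invention, "uni_pc" is what I understand to be ComfyUI's identifier for the UniPC sampler, and the CFG for the 13-step preset is an assumption because I did not record CFG for that run.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class RunSettings:
    steps: int
    cfg: float
    sampler: str = "uni_pc"  # assumed ComfyUI identifier for UniPC
    shift: Optional[float] = None  # None means "leave the workflow default"


PRESETS = {
    # 8 steps, CFG 3: the first Workflow 1 run.
    "draft": RunSettings(steps=8, cfg=3.0),
    # 10 steps, CFG 4, Shift 7: the second prompt-variation test.
    "balanced": RunSettings(steps=10, cfg=4.0, shift=7.0),
    # CFG was not recorded for the 13-step run; 4.0 is an assumption.
    "detail": RunSettings(steps=13, cfg=4.0),
}


def get_preset(name):
    """Look up a preset by name, with a readable error for typos."""
    try:
        return PRESETS[name]
    except KeyError:
        raise ValueError(f"unknown preset {name!r}; choose from {sorted(PRESETS)}")
```

Calling get_preset("balanced") returns the 10-step settings, which you can then copy into the sampler node by hand or feed into a script.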

Time and Quality at a Glance

Below is a quick summary of the runs I described, using the same or slightly varied prompts in the order they appear above.

Steps	CFG	Extra	Prompt Variant	Time (s)	Outcome Summary
8	3	—	Base prompt	—	Decent for speed, needs more detail
10	3	—	Base prompt	—	Noticeably better than 8 steps
8	4	—	Base prompt	—	Slight lift, still not as good as 10 steps
13	—	—	Base prompt	—	Quality improved over 8–10 steps
6	—	—	Base prompt	—	Failed: blurry and unstable
8	—	—	Base prompt	—	Acceptable again for quick drafts
9	—	—	Base prompt	~292	Strong balance; “pretty good” result
7	6	—	Short black hair variant	—	Not bad but not good enough
10	4	Shift=7	Short black hair variant	~333	Good; background car shows issues at low steps
15	—	—	Short black hair variant	—	Looks good; hands odd, small forehead mark

Notes:

Dashes mean I didn’t change or measure that field in that run.
Times vary by hardware and exact node configuration. I’m only listing the two measurements mentioned during the runs: ~292 s (9 steps) and ~333 s (10 steps).
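
The two timings give a rough per-step cost that can be used to guess how long other step counts might take. The sketch below assumes time scales in proportion to step count with no fixed overhead (model loading, VAE decode), which is a simplification. The two runs also used different prompts and settings, so treat the output as a ballpark for my hardware only. The function names are hypothetical.

```python
# Measured runs from the table above: (steps, seconds).
MEASURED = [(9, 292.0), (10, 333.0)]


def seconds_per_step(runs=MEASURED):
    """Average cost of one step, pooled over all measured runs."""
    total_steps = sum(steps for steps, _ in runs)
    total_seconds = sum(seconds for _, seconds in runs)
    return total_seconds / total_steps


def estimate_seconds(steps, runs=MEASURED):
    """Proportional estimate: steps times the pooled per-step cost."""
    return steps * seconds_per_step(runs)


if __name__ == "__main__":
    for steps in (8, 13, 15):
        print(f"{steps} steps: roughly {estimate_seconds(steps):.0f} s")
```

The pooled figure works out to about 33 seconds per step (625 seconds over 19 steps), so each extra step costs roughly half a minute on this setup.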

Step-by-Step: Reproducing These Runs in ComfyUI

Preparation

Download the AccVideo LoRA/model from the project page.
Place the file in your ComfyUI models folder (a LoRA goes in the models/loras directory).
Ensure Wan 2.1 is installed and selectable in your nodes.

Workflow 1

Build the Graph

Load Wan 2.1 as the base model.
Add the AccVideo LoRA node and set its strength as per your usual LoRA practice.
Set the sampler to UniPC.
Use a standard video generation graph for Wan, with text prompt input and output nodes.

Initial Run

Prompt: “A realistic, beautiful woman with elegant features and long hair in the streets of Japan.”
Steps: 8
CFG: 3
Run the graph and review the clip.

Improve the Result

Increase steps to 10 (keep CFG 3).
Re-run and compare.
Try CFG 4 at 8 steps if you want to see the small difference at the same low step count.

Workflow 2

Build the Graph

Load the “Dora ACC Video 14B” configuration if your workflow references it.
Keep Wan 2.1 as the base with the AccVideo LoRA in place.
Sampler: UniPC (or the same sampler used earlier for consistency).
Same general video graph as Workflow 1.

Runs in Order (to match the tests)

13 steps: Run with the same base prompt; observe the added stability and detail over 10 steps.
6 steps: Test to confirm the quality drop; expect blur and instability.
8 steps: Return to a workable preview level.
9 steps: Measure time if you want; my run was ~292 seconds and “pretty good.”

Prompt Variation

Update the prompt to: “A realistic, beautiful American woman with elegant features and short black hair in the streets of Japan.”
7 steps, CFG 6: Expect results that are passable but below target quality.
10 steps, CFG 4, Shift 7: Expect a clear lift; my run took ~333 seconds.
15 steps: Evaluate artifacts. In my case, hands were odd and there was a small forehead mark.
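
If you would rather script these runs than click through them, a workflow exported in ComfyUI's API format can be patched and queued over HTTP. The sketch below assumes the workflow uses the built-in KSampler node with steps, cfg, and seed inputs; if your graph uses a wrapper sampler node, pass its class name instead and check its input names. The helper names and the default server address (ComfyUI's usual local port) are assumptions.

```python
import copy
import json
import urllib.request


def patch_sampler(workflow, steps, cfg, seed=None, sampler_class="KSampler"):
    """Return a copy of an API-format workflow with new sampler settings.

    The workflow is a dict of node id -> {"class_type": ..., "inputs": {...}}.
    Only nodes whose class_type matches sampler_class are changed.
    """
    patched = copy.deepcopy(workflow)  # leave the caller's workflow untouched
    for node in patched.values():
        if node.get("class_type") == sampler_class:
            node["inputs"]["steps"] = steps
            node["inputs"]["cfg"] = cfg
            if seed is not None:
                node["inputs"]["seed"] = seed
    return patched


def queue_prompt(workflow, server="http://127.0.0.1:8188"):
    """Send a workflow to a running ComfyUI server's /prompt endpoint."""
    payload = json.dumps({"prompt": workflow}).encode("utf-8")
    request = urllib.request.Request(
        f"{server}/prompt",
        data=payload,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read())
```

A typical use would be to load the exported JSON with json.load, call patch_sampler(workflow, steps=10, cfg=4.0), and pass the result to queue_prompt while ComfyUI is running.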

Quality Tuning Notes

Step count matters most. The jump from 8 to 10 steps generally helped more than raising CFG at very low steps.
CFG around 3–4 works well for short runs. Increasing CFG to 6 didn’t fix the 7-step weakness in my tests.
Complex backgrounds (like cars) may show structural issues at low step counts. Raising steps is more effective than pushing CFG higher for those cases.
Artifacts can still appear at high steps. In my 15-step test, hand detail and a small forehead mark were present.

Recommended Ranges (Based on These Tests)

Quick previews: 8 steps, CFG ~3–4
Balanced runs: 9–10 steps, CFG ~3–4
Higher detail: 12–13 steps
Avoid: 6 steps (too blurry), 7 steps (borderline)
Extra setting that helped in one test: Shift = 7 (with 10 steps and CFG 4)

These align with the original goal of making Wan 2.1 video runs far faster without losing the essence of the scene.
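
As a quick sanity check before queuing a run, the ranges above can be turned into a small lookup. This is a sketch of my own reading of the tests: the function name and labels are hypothetical, and step counts I did not test (11, 12, and 14) are grouped with their nearest tested neighbors by assumption.

```python
def classify_steps(steps):
    """Map a step count to the quality tier observed in these tests."""
    if steps <= 6:
        return "avoid"  # 6 steps was blurry and unstable
    if steps == 7:
        return "borderline"  # passable, below target quality
    if steps == 8:
        return "preview"  # workable for quick drafts
    if steps <= 10:
        return "balanced"  # 9-10 steps
    if steps <= 13:
        return "higher detail"  # 11 is untested; grouped here by assumption
    return "diminishing returns"  # 15 steps still showed artifacts
```

For example, classify_steps(9) returns "balanced" and classify_steps(6) returns "avoid".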

Troubleshooting Checklist

If output is blurry:
- Increase steps to 9–10 before changing anything else.
- Keep CFG moderate (around 3–4) at low step counts.
If background objects break apart:
- Raise steps to 10–12.
If faces or hands look odd:
- Try 10–12 steps first.
- If issues persist, review your LoRA strength and prompt specificity.
If times are too long:
- Drop to 8–9 steps for previews.
- Keep the same seed so comparisons are fair across step counts.
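
To make the fixed-seed comparison concrete, here is a small sketch that builds a list of run settings sharing one seed. The function name and the seed value are arbitrary choices for illustration; the step and CFG values are the ones from Workflow 1.

```python
from itertools import product


def build_sweep(step_values, cfg_values, seed):
    """Return one settings dict per (steps, cfg) pair, all with the same seed."""
    return [
        {"steps": steps, "cfg": cfg, "seed": seed}
        for steps, cfg in product(step_values, cfg_values)
    ]


if __name__ == "__main__":
    # Same seed everywhere, so only steps and CFG differ between clips.
    for run in build_sweep([8, 10], [3.0, 4.0], seed=12345):
        print(run)
```

Because every run uses the same seed, differences between the clips come from the step count and CFG rather than from a different starting noise.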

Why 10–12 Steps Felt Best

Across the runs, 10–12 steps consistently offered:

Better motion stability than 8 steps
Stronger subject detail without a steep time penalty
Fewer visual issues in complex backgrounds

13 steps improved quality but added time. 15 steps didn’t guarantee artifact-free results. That’s why 10–12 steps became the practical target range.

Final Notes

I’ll share both ComfyUI workflows so you can replicate these tests and build on them.
The AccVideo approach with Wan 2.1 delivered meaningful speed gains in my runs, especially in the 8–10 step range.
For a good balance: start at 9–10 steps, CFG around 3–4, and adjust based on your scene complexity.

With that, you have the exact prompts, settings, times, and outcomes in the same order they were tested. Save the workflows, swap in your own prompts, and tune within the ranges above to match your hardware and visual goals.
