Show HN: Ez FFmpeg – Video editing in plain English
I built a CLI tool that lets you do common video/audio operations without remembering ffmpeg syntax.
Instead of:
  ffmpeg -i video.mp4 -vf "fps=15,scale=480:-1:flags=lanczos" -loop 0 output.gif
You write:
  ff convert video.mp4 to gif
More examples:
  ff compress video.mp4 to 10mb
  ff trim video.mp4 from 0:30 to 1:00
  ff extract audio from video.mp4
  ff resize video.mp4 to 720p
  ff speed up video.mp4 by 2x
  ff reverse video.mp4
There are similar tools that use LLMs (wtffmpeg, llmpeg, ai-ffmpeg-cli), but they require API keys, cost money, and have latency.
Ez FFmpeg is different:
- No AI – just regex pattern matching
- Instant – no API calls
- Free – no tokens
- Offline – works without internet
It handles ~20 common operations that cover 90% of what developers actually do with ffmpeg. For edge cases, you still need ffmpeg directly.
Interactive mode (just type ff) shows media files in your current folder with typeahead search.
npm install -g ezff
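For readers wondering what "just regex pattern matching" could look like, here is a minimal sketch. It is not the tool's actual source: the `PATTERNS` table, the `translate` function, and the output names `output.gif` and `trimmed.mp4` are all hypothetical. The GIF filter string is the one quoted in the post; the trim flags are an assumption and would re-encode the clip.

```python
import re
import shlex

# Hypothetical pattern table: each regex maps a phrase to an ffmpeg argv builder.
# The GIF filter string comes from the post; the trim flags are an assumption.
PATTERNS = [
    (
        re.compile(r"^convert (?P<src>\S+) to gif$", re.IGNORECASE),
        lambda m: [
            "ffmpeg", "-i", m["src"],
            "-vf", "fps=15,scale=480:-1:flags=lanczos",
            "-loop", "0", "output.gif",
        ],
    ),
    (
        re.compile(
            r"^trim (?P<src>\S+) from (?P<start>[\d:.]+) to (?P<end>[\d:.]+)$",
            re.IGNORECASE,
        ),
        lambda m: [
            "ffmpeg", "-i", m["src"],
            "-ss", m["start"], "-to", m["end"],
            "trimmed.mp4",
        ],
    ),
]


def translate(phrase):
    """Return the ffmpeg argv for a phrase, or None if no pattern matches."""
    text = phrase.strip()
    for pattern, build in PATTERNS:
        match = pattern.match(text)
        if match:
            return build(match)
    return None


if __name__ == "__main__":
    # Print the command instead of running it, so the sketch works without ffmpeg.
    print(shlex.join(translate("convert video.mp4 to gif")))
```

Note that `\S+` does not match file names containing spaces, which is the kind of edge case a real tool would have to handle.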
310 points by josharsh - 141 comments
It's incredible what lengths people go to to avoid memorizing basic ffmpeg usage. It's really not that hard, and the (F.) manual explains the basic concepts fairly well.
Now, granted, ffmpeg's defaults (reencoding by default and only keeping one stream of each type unless otherwise specified) aren't great, which can create some footguns, but as long as you remember to pass `-c copy` by default you should be fine.
Also, hiding those footguns is likely to create more harm than it fixes. Case in point: "ff convert video.mkv to mp4" (an extremely common usecase) maps to `ffmpeg -i video.mkv -y video.mp4` here, which does a full reencode (losing quality and wasting time) for what can usually just be a simple remux.
Similarly, "ff extract audio from video.mp4" will unconditionally reencode the audio to mp3, again losing quality. The quality settings are also hardcoded and hidden from the user.
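To make the remux-versus-reencode distinction concrete, here is a small sketch that only builds the command lines (it does not run ffmpeg). The function names and file names are made up for illustration. Stream copy only works when the target container supports the source codecs, so the `.m4a` example assumes the audio track is AAC.

```python
import shlex


def remux_command(src, dst):
    """Copy the encoded streams into a new container without re-encoding."""
    return ["ffmpeg", "-i", src, "-c", "copy", dst]


def extract_audio_command(src, dst):
    """Drop the video (-vn) and copy the audio bitstream unchanged."""
    return ["ffmpeg", "-i", src, "-vn", "-c:a", "copy", dst]


if __name__ == "__main__":
    print(shlex.join(remux_command("video.mkv", "video.mp4")))
    # Assumes the audio track is AAC, which an .m4a file can hold as-is.
    print(shlex.join(extract_audio_command("video.mp4", "audio.m4a")))
```

If the copy fails because the container cannot hold a codec, that is the point where a re-encode (and a deliberate choice of quality settings) becomes necessary.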
I can sympathize with ffmpeg syntax looking complicated at first glance, but the main reason for this is just that multimedia is really complicated and that some of this complexity is necessary in order to not make stupid mistakes that lose quality or waste CPU resources. I truly believe that these ffmpeg wrappers that try to make it seem overly simple (at least when it's this simple, i.e. not even exposing quality settings or differentiating between reencoding and remuxing) are more hurtful than helpful. Not only can they give worse results, but by hiding this complexity from users they also give users the wrong ideas about how multimedia works. "Abstractions" like this are exactly how beliefs like "resolution and quality are the same thing" come to be. I believe the way to go should be educating users about video formats and proper ffmpeg usage (e.g. with good cheat sheets), not by hiding complexity that really should not be hidden.
Edit: Reading through my comment again, I have to apologize for the slightly facetious opening statement, even if I qualify it later on. The fact that so many ffmpeg wrappers exist is saying something about its apparent difficulty, but as I argue above, a) there are reasons for this (namely, multimedia itself just being complicated), and b) I believe there are good and bad ways to "fix" this, with oversimplified wrappers being more on the "bad" side.
I've learned not to say this. Different things are easy/hard for each of us.
Reminds me of a discussion where someone argued, "why don't all the poor/homeless people just go get good jobs?"
Edit: I know your comment was meant to inspire/motivate us to try harder. Maybe it's easier than it appears.
Now, I can simply ask any LLM to write the command, and understand any following issues or questions.
For example, my OS records videos as WEBM. Using the default settings for transforming to MP4 usually fails from a resolution ratio issue. I would be deadlocked using this library.
It really isn't that hard anymore.
In my opinion there are two kinds of users: 1. Users who use FFmpeg regularly enough to know/understand the parameters. 2. Users who only use FFmpeg once in a while to do something specific.
This wrapper is superfluous for users in group number 1. But group number 2 does not really get much out of it either, for the reasons you've mentioned.
As a member of group 2, I usually want to do something very specific (e.g. remove an audio track, convert only the video, remux to a different container, etc.). A simple English wrapper does not help me here because it is not powerful enough; the defaults are usually not what I want. What I need is a tool that will take a more detailed English statement of what I want to achieve and spit out the FFmpeg command with explanations for what each parameter does and how it achieves my goal. We have this today: AI; and it mostly works (once you've gone through several iterations of it hallucinating options that do not exist...).
if you are doing it often that's true. But for people like me who do it once every month or two it really is hard to memorize, especially if it's not exactly the same task.
What I would love would be an interactive script that asked me what I was trying to do and constructed a command line for me while explaining what it would do and the meaning of each argument. And of course it should favour commands that do not re-encode where possible.
Start the tool, and just list all of the options in order of usage popularity to toggle on as desired, with a brief explanation, and a field to paste in arguments like filenames or values when needed. If an option is commonly used with another (or requires it), provide those hints (or automatically add the necessary values). If a value itself has structure (e.g. is itself a shell command), drill down recursively. Ensure that quotes and spaces and special characters always get escaped correctly.
In other words, a general-purpose command-line builder. And while we're at it, be able to save particular "templates" for fast re-use, identifying which values should be editable in the future.
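A tiny sketch of the "saved template with editable values" part of this idea, using only the Python standard library. The `TEMPLATES` table and the `build` function are hypothetical, and the quoting is POSIX-shell quoting via `shlex.quote`, so it would not be correct for cmd.exe or PowerShell.

```python
import shlex
from string import Template

# Hypothetical saved template: $-placeholders mark the values that stay editable.
TEMPLATES = {
    "remux": "ffmpeg -i $src -c copy $dst",
}


def build(name, **values):
    """Fill a saved template, shell-quoting every user-supplied value."""
    quoted = {key: shlex.quote(str(value)) for key, value in values.items()}
    # substitute() raises KeyError if a placeholder is left without a value.
    return Template(TEMPLATES[name]).substitute(quoted)


if __name__ == "__main__":
    print(build("remux", src="my video.mkv", dst="out.mp4"))
```

The interactive part (listing options by popularity, hinting at related options) would sit on top of a table like this; the hard part, as noted, is producing the per-tool metadata.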
I can't be the first person to think of this, but I've never come across anything like it and don't understand why not. It doesn't require AI or anything. Maybe it's the difficulty involved in creating the metadata for each tool, since man pages aren't machine-readable. But maybe that's where AI can help -- not in the tool itself, but to create the initial database of tool options, that can then be maintained by hand?
(Navi [1] does the templating part, but not the "interactive builder" part.)
[1] https://github.com/denisidoro/navi
Or if no telemetry but based on local usage, it would promote/reinforce the options you already can recall and do use, hiding the ones you can’t/don’t?
That said, I started writing scripts when I use ffmpeg some time ago. At least then I have a non-zero starting point next time.
I find myself bothering exactly zero times to memorise this obnoxiously long command line. Claude fills in, and I can explore features better. What’s not to like? That I’m getting dumber for not memorising pages of cli args?
Love the project, but as with every Swiss knife this conversation is a thing and relevant. We had a similar one regarding JQ syntax, and I’m truly convinced JQ is a wonderful and useful tool. But I’m not gonna bother learning more DSLs…
It is only a couple of thousand options[0], just memorize them! It's super simple, barely an inconvenience!
[0]https://gist.github.com/tayvano/6e2d456a9897f55025e25035478a...
It's not hard - just not a good use of our time. For 99% of HN users, ffmpeg is not a vital tool.
I have to use it less than twice a year. Now I just go and get an LLM to tell me the command I need.
And BTW, I spend a lot of time memorizing things (using spaced repetition). So I'm not averse to memorizing. ffmpeg simply doesn't warrant a place in my head.
I’m going to guess your job does not involve much UX design?
- making sure pixels are square while resizing if the video resolution is too large
- dealing with some HDR or high gamut thing I can't really remember that can result from screen recording on macos using some method I was using at some point
- setting this one tag on hevc files that macos needs for them to be recognised as hevc but isn't set by default
- calculating the target bitrate if I need a specific filesize and verifying the encode actually hit that size and retrying if not (doesn't always work first time with certain hardware encoders even if they have a target or max bitrate parameter)
- dealing with 2-pass encoding which is fiddly and requires two separate commands and the parameters are codec specific
- correctly activating hardware encoding for various codecs
- etc
And this is just for the basic task of "make this into a simple mp4"
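The target-size arithmetic from the list above can be sketched as follows. This is an illustration, not anyone's production script: it ignores container overhead, treats 1 MB as 10^6 bytes, and assumes libx264 with AAC audio. The function names are made up. For example, a 10 MB target for a 60 second clip with 128 kbit/s audio leaves roughly 1205 kbit/s for video.

```python
import os


def video_bitrate_kbps(target_mb, duration_s, audio_kbps=128):
    """Video bitrate (kbit/s) that fits target_mb megabytes, ignoring container overhead."""
    total_kbps = target_mb * 8000 / duration_s  # 1 MB = 8000 kbit
    video_kbps = int(total_kbps - audio_kbps)
    if video_kbps <= 0:
        raise ValueError("target size is too small for this duration and audio bitrate")
    return video_kbps


def two_pass_commands(src, dst, video_kbps, audio_kbps=128):
    """Return the two ffmpeg invocations for a libx264 two-pass encode."""
    rate = f"{video_kbps}k"
    # Pass 1 only gathers statistics, so audio is dropped and output is discarded.
    first = ["ffmpeg", "-y", "-i", src, "-c:v", "libx264", "-b:v", rate,
             "-pass", "1", "-an", "-f", "null", os.devnull]
    # Pass 2 uses those statistics to hit the requested average bitrate.
    second = ["ffmpeg", "-i", src, "-c:v", "libx264", "-b:v", rate,
              "-pass", "2", "-c:a", "aac", "-b:a", f"{audio_kbps}k", dst]
    return first, second


if __name__ == "__main__":
    kbps = video_bitrate_kbps(10, 60)
    for argv in two_pass_commands("input.mp4", "output.mp4", kbps):
        print(" ".join(argv))
```

Verifying the resulting file size and retrying with a lower bitrate, as described above, would still be a separate step; this sketch covers only the first estimate.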
But my issue with the linked tool is that it does none of the things you mentioned. All it does is make already very easy things even easier. Is it really that much harder to remember `ffmpeg -i inputfile outputfile.ext` than `ff convert inputfile to ext`?
I've explained this in other replies here but I am neither saying that ffmpeg wrappers are automatically bad, nor that ffmpeg cannot be complicated. I am only saying that this specific tool does not really help much.
Personally I think it’s great that it’s such a universally useful tool that it has been deployed in so many different variations.
> some folks want to use lossless cut
In that case I would encourage you to ruminate on what the following in the post you're replying to means and what the implications are:
> "ff convert video.mkv to mp4" (an extremely common usecase) maps to `ffmpeg -i video.mkv -y video.mp4` here, which does a full reencode (losing quality and wasting time) for what can usually just be a simple remux
Depending on the size of the video, the time it would take you to "do the job swiftly" (i.e. not caring about how the tools you are using actually work) might be more than just reading the ffmpeg manual, or at the very least searching for some command examples.
You may have misunderstood the comment: "lossless cut" is the name of an ffmpeg GUI front end. They're not discussing which exact command line gives lossless results.
>It's incredible what lengths people go to to avoid memorizing basic ffmpeg usage. It's really not that hard, and the (F.) manual explains the basic concepts fairly well.
Not really sure how else I was supposed to interpret your comment but clarification taken.
> But I argue in my comment above that this specific tool does not have better QoL
For some folks it may be better/more intuitive. It doesn’t hurt anybody by existing.
We all compromise with different tools in our lives in different ways. It just reads to me like an odd axe to grind.
Simply put: What is so bad about the existence of this project?
Yes, that was a bit facetious of me, I apologize for that.
> What is so bad about the existence of this project?
Being very blunt: The fact that it reinforces the extremely common misconception that a) converting between containers like mkv and mp4 will always require reencoding and that b) there is a single way to reencode a video (hence suggesting that there is no "bad" way to reencode a video), seeing as next to no encoding settings are exposed.
As the kids these days say: just take the L, man.
I personally use lossless cut more than ffmpeg in the terminal just because I don’t have to really think about it and it can do most of what I need, which is simply removing or attaching things together without re-encoding. I use it maybe once every month or two, because it’s just not something I need to use a ton, so it doesn’t make sense for me to get down and dirty with the original. Ultimately I get what I need and I’m happy!
There isn't internal consistency to really hold on to ... it's just a bunch of seemingly independent options.
The biggest problem is open source teams really don't get people on board that focus on customer and product the way commercial software does. This is what we get as a result
Sure, I agree with all of this. Like I said above, the syntax (and, even more, the defaults) isn't great. I'm just arguing that "improving the syntax" should not mean "hiding complexity that should not be hidden", as the linked project does. An alternative ffmpeg frontend (i.e. a new CLI frontend using the libav* libraries like ffmpeg is, not a wrapper for the ffmpeg CLI program) with better syntax and defaults but otherwise similar capabilities would be a very interesting project.
(The answer to your question is that both -vcodec and -c:v are valid, but I imagine that's not the point.)
> The biggest problem is open source teams really don't get people on board that focus on customer and product the way commercial software does.
I believe in this case it may be more of a case of backwards compatibility, with options being added incrementally over time to add what was needed at the moment. Though that's just my guess.
[1] https://blog.pkh.me/p/21-high-quality-gif-with-ffmpeg.html
But one case I see often: If you’re making a website with an animated gif that’s actually a .gif file, try it as an mp4 - smaller, smoother, proper colors, can still autoplay fine.
/s
The problem is someone decided that and the contents of Wikipedia was all something needs to be intelligent haha
It is almost like there is hardwiring in our brains that makes us instinctively correlate language generation with intelligence and people cannot separate the two.
It would be like if for the first calculators ever produced instead of responding with 8 to the input 4 + 4 = printed out "Great question! The answer to your question is 7.98" and that resulted in a slew of people proclaiming the arrival of AGI (or, more seriously, the ELIZA Effect is a thing).
> Write an ffmpeg command that implements the "bounce" effect: play from 0:00 to 0:03, then backwards from 0:03 to 0:00, then repeat 5 times.
Maybe this should be an AI reasoning test.
Here is what eventually worked, iirc (10 bounces):
... Provided that the user sees what's being made for them and can confirm it and (hopefully) learn the target "language."
Tutor, not a do-for-you assistant.
* - Just a few days ago I used ImageMagick for the first time in at least three years. I downloaded it just to find that I already had it installed.
It isn’t fair to say “since I don’t read the source of the libraries I install that are written by humans, I don’t need to read the output of an llm; it’s a higher level of abstraction” for two reasons:
1. Most libraries worth using have already been proven by being used in actual projects. If you can see that a project has lots of bug fixes, you know it’s better than raw code. Most bugs don’t show up unless code gets put through its paces.
2. Actual humans have actual problems that they’re willing to solve to a high degree of fidelity. This is essentially saying that humans have both a massive context window and an even more massive ability to prioritize important things that are implicit. LLMs can’t prioritize like humans because they don’t have experiences.
You can’t verify LLM’s output. And thus, any form of trust is faith, not rational logic.
With an LLM’s output, it is short enough that I can* put in the effort to make sure it's not obviously malicious. Then I save the output as an artefact.
* and I do put in this effort, unless I'm deliberately experimenting with vibe coding to see what the SOTA is.
In the case of npm and the like, I don't trust them because they are actually using insecure procedures, which is proven to be so. And the vectors of attack are well known. But I do trust Debian and the binaries they provide, as the risks are the Debian infrastructure being compromised, malicious code in the original source, and cryptographic failures. All three are possible, but there's more risk of bodily harm to myself than of them happening.
And, realistically, compute and power is cheap for getting help with one-off CLI commands.
A sane homogeneous cli for once, that treats its user as a human instead of forcing them to remember the incompatible invocation options of `tar` and `dd` for absolutely no reason.
And add amazing autocomplete, while allowing as many wordings as possible. No need for LLMs. One can dream.
Dang! not that one, the other one!
> zip my-folder into my-zip.tar with compression level 9
What do you mean, I don't have write permissions in the current working directory? I meant for you to put the output in $HOME, i mean /tmp, i mean /var/tmp, i mean on the external hard drive, no other other one.
> git delete commit 1a4db4c
What did you do? I didn't mean delete it and erase it from the reflog and run gc! I just mean "delete it" the way any one would ever mean that! I can never get it back now!
I would prefer not to change the technical aspects of Linux. I actually cherish it.
https://github.com/dheera/scripts/blob/master/helpme
Example usage:
This originated from an ffmpeg wrapper I wrote but then realized it could be used for all commands: https://news.ycombinator.com/item?id=40410637
"Hey computer, can you convert that funny kitchen cooking scene in this movie to a .gif I can share online?"
You're wasting your time on a dead man walking paradigm doing anything else. "Plain English" actually means plain English now.
Memorising command line options beyond the absolute basics has rarely been helpful to me. And I use FreeBSD, where arcane commands are plentiful.
One workaround: when there is a syntax error, let the user optionally switch to an LLM?
Using a different package name could be helpful. I searched for ezff docs and found a completely different Python library. Also ez-ffmpeg turns up a Rust lib which looks great if calling from Rust.
Which format is the default if no argument is given?
Or more complicated contextual knowledge - if you cut 1sec of a video file, does fish autocomplete to tell you whether the video is reencoded or cut (otherwise) losslessly?
Also, what does fish complete to on Windows?
I know what I want to do, I don't know how it's being done, but there's a wealth of information that is very accessible. So I just read it.
It's very easy to type `apropos ffmpeg`. And even if you typed `man ffmpeg`, if you go to the end, you will find the names of related manuals for more information. And you can always use the pager (`less` in most cases) facility for quick search.
I believe that a lot of frustration comes from people unwilling to learn the conceptual basis of the tools they are using.
> It's very easy to type `apropos ffmpeg`
No it's not. First, that's not a Windows command, so right off the bat you've cut off the largest OS. Second, your command is naively empty and it's telling that you've given it instead of an actual search query because you wouldn't be able to come up with a great one right away that would result in the correct result at the top - while the correct result is "hardcoded" in the field type in the UI. So yeah, go on, find that perfect query and then explain why you think every single user should be able to do the same quickly. Then you can think about how justified your other beliefs are about basic workflow issues you don't understand.
Then any solution is broken in this way. Even my bluetooth speaker comes with a manual. Not reading it and saying the speaker is broken, because you can't figure out how to connect, is pure delusion. Same as not reading ffmpeg manual and expecting to know how to use it.
> First, that's not a Windows command, so right off the bat you've cut off the largest OS.
ffmpeg on Windows is so far off the beaten path that it may as well be in Mordor. I would gladly bet that someone that knows how to run ffmpeg on windows also knows how to find the documentation for it.
> So yeah, go on, find that perfect query
Why would I find the perfect query? Do you go in the library and then find the correct line of the correct book in one go? Or do you consult the list of books of books for a theme, select a few candidates, consult their index, and then read the pages?
Then all that is left to do is to note down the reference if you need to consult the book again (no need to remember everything).
Nope, you're just doing the same thing - purposefully ignoring the issue to make your non-solution comparable...
> Even my bluetooth speaker comes with a manual.
... in this case - the length and scope of the manual. First, you can operate the speaker without the manual or with just a single read of the manual, so you spend a few seconds learning how to pair (but you might not even need that as "hold to pair" might be something you remember from other devices), then the power/volume buttons require no manual because you've operated such buttons your whole life.
> Same as not reading ffmpeg manual
Of course it's not the same, the ffmpeg manual isn't a tiny page of 5 items, and no other apps will help you learn the peculiarities of ffmpeg. Also, the whole point of intuitive UI with "typed info" is that you don't need to read that huge manual to do the basics as you can simply follow the structure laid out by someone more knowledgeable
> ffmpeg on Windows is so far off the beaten path that it may as well be in Mordor. I would gladly bet that someone that knows how to run ffmpeg on windows also knows how to find the documentation for it.
Who would take that irrelevant bet? The issue isn't in finding! the manual!
> Why would I find the perfect query?
To prove that your solution works. I know it doesn't and challenge you to prove otherwise. Your suggestion is worse than asking users to Google, because at least there users will likely get the correct top result in a few tries for common needs
> Do you go in the library and then find the correct line of the correct book in one go?
No, I open an app and pick the correct format from the drop-down menu correctly in one go
> Or do you consult the list of books of books for a theme, select a few candidates, consult their index, and then read the pages?
Oh man, even in your fantasies you can't come up with a good workflow! No wonder you're fine suggesting everyone wastes a lot of time aproposing empty queries
It's the same with video viewers or music players. Often the default app of the OS is enough and they are very intuitive. But sometimes you need a bit more control, and that's when something like vlc or mpv, with their extensive filter capabilities (which require having the docs at hand), becomes mandatory.
ffmpeg's interface is ok for what it does. Any of your suggestions would be complex to implement if it aims to support the whole feature set of ffmpeg.
lol
Could you elaborate on this? I see a lot of AI-use and I'm wondering if this is claude speaking or you
https://github.com/sirodoht/llmwrap
It will sample images from the video then go crop the video to that, stabilize if required, and then make me an optimized GIF that I can put in my weekly journal.
Quite telling that these tools need to exist to make ffmpeg actually usable by humans (including very experienced developers).
If one has only a few such commands, it's as simple as writing bash aliases and adding them to ~/.bashrc
alias convertmkvtomp4='ffmpeg command'
then just run it anytime with that alias phrase. I use ffmpeg a lot, so I have my own dedicated cli snippet tool to quickly build out complex pipelines in easier language.
The best part is that I have --dry-run, which exposes the flow plus the explicit commands being used at each step, in case I need details on what's happening and verbose output at each step.
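A dry-run switch of the kind described here is straightforward to sketch. This is not the commenter's tool; `run_pipeline` and the step labels are hypothetical, and the example steps are plain ffmpeg invocations built as argument lists.

```python
import shlex
import subprocess


def run_pipeline(steps, dry_run=False):
    """Run each (label, argv) step in order; with dry_run, only print the commands."""
    shown = []
    for label, argv in steps:
        line = f"[{label}] {shlex.join(argv)}"
        print(line)
        shown.append(line)
        if not dry_run:
            # check=True stops the pipeline at the first failing step.
            subprocess.run(argv, check=True)
    return shown


if __name__ == "__main__":
    steps = [
        ("remux", ["ffmpeg", "-i", "in.mkv", "-c", "copy", "out.mp4"]),
        ("audio", ["ffmpeg", "-i", "out.mp4", "-vn", "-c:a", "copy", "out.m4a"]),
    ]
    run_pipeline(steps, dry_run=True)
```

Passing argument lists rather than a single shell string avoids quoting problems when the commands are actually executed.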
But yea ffmpeg is awesome software, one of the great oss projects imo. working with video is hellish and it makes it possible.
Is there an easier way?
I think it was an M4 Mac. Does iMovie need a codec pack? I know some PC OEMs don't ship an h.265 codec, pointing users to a $0.99 download. Thought Mac would include it, being aimed at content creators. Hoping for a cheaper solution than Adobe Premiere.
To re-encode the content into H.264+AAC, rather than simply "muxing" the encoded bitstreams from the MP4 container into a new MOV container.
I like that you took a no-AI approach. I am looking for something like this, i.e. understanding intent and generating a command without using AI, but so far regex-based approaches have proved inadequate. I also tried indexing keywords and creating an index of keywords with similar meaning, which improved the situation a bit, but without something heavy like BERT it's always a subpar experience.
Has anyone else been avoiding typing FFmpeg commands by using file:// URLs with yt-dlp?
https://github.com/dheera/scripts/blob/master/helpme
This evolved from an ffmpeg wrapper I wrote before: https://news.ycombinator.com/item?id=40410637