Hacker News

Moebius: 0.2B image inpainting model with 10B-level performance

114 points by DSemba - 20 comments

delis-thumbs-7e [3 hidden]5 mins ago

This is the useful AI stuff. There are so many use cases this makes possible.

doctorpangloss [3 hidden]5 mins ago

how many times have you edited a photo you took on your phone in the last 7 days?

GL26 [3 hidden]5 mins ago

Could this run locally on a smartphone?

teroshan [3 hidden]5 mins ago

Unrelated, but when I read "inpainting" and "Moebius" I was scared it was related to, and using the art of, the great Jean Giraud [0], a.k.a. Moebius.

https://characterdesignreferences.com/artist-of-the-week-3/m...

[0] https://en.wikipedia.org/wiki/Jean_Giraud

coldtea [3 hidden]5 mins ago

Scared why?

teroshan [3 hidden]5 mins ago

Scared for the same reason I found last year's 'Ghibli filter' craze upsetting: I would have personally hated to see this artist's legacy used to promote AI image generation.

TeMPOraL [3 hidden]5 mins ago

In case that happened then the rest of the world would probably appreciate the art, and a subset of it, the artist (and even a small subset of ~whole Internet-connected population is a lot of people). Some silver lining, perhaps.

solid_fuel [3 hidden]5 mins ago

> In case that happened then the rest of the world would probably appreciate the art

What art?

We’re talking about generated pictures, aka slop, not art made by a real human.

And I don’t know if you’ve been paying attention but people seem to be pretty tired of the slop. I don’t think it would be appreciated nearly as much as you think.

TeMPOraL [3 hidden]5 mins ago

This definition of "slop" doesn't quite cut reality at the joints.

People are tired of marketing. The AI-generated slop people are annoyed with is garbage produced for marketing reasons, and it's distinctly noticeable precisely because all the bottom-feeder marketing houses switched to using it. But it's not the AI itself that's the problem here. Slop was here before, but it was made with cheap protein-based image generators. Silicon-based generators are just cheaper.

NooneAtAll3 [3 hidden]5 mins ago

I don't understand. Is it available somewhere to try or is it just an ad?

owebmaster [3 hidden]5 mins ago

Yeah it's great but how do I use it?

Edit: I think I found it https://huggingface.co/hustvl/Moebius

K0IN [3 hidden]5 mins ago

With this size we could have an interactive web demo.

N_Lens [3 hidden]5 mins ago

The gallery of their samples is pretty impressive!

epolanski [3 hidden]5 mins ago

What is the current SOTA for inpainting?

I have a potential project for my e-commerce where I want to allow users to upload images of their house exteriors and inpaint awnings.

TeMPOraL [3 hidden]5 mins ago

Awnings, if I understand correctly (I just learned this word right now), are purely additive attachments to structure exteriors, so perhaps they wouldn't necessarily need a full inpainting model? Wouldn't it be enough to estimate an affine transform for a quad and blend the awning image in directly (and do the same with a shadow map to fake shade)? Is classical photogrammetry up to such a task these days?
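
To make the idea above concrete, here is a minimal sketch in plain Python, not anything from the Moebius project. It assumes grayscale images stored as lists of rows, a constant opacity, and hypothetical helper names (`solve_affine`, `apply_affine`, `composite`). An affine map is fixed by three point pairs, so this places the overlay on a parallelogram; a quad seen in perspective would need a homography instead, and the shadow map is left out.

```python
import math

def solve_affine(src, dst):
    """Return ((a, b, c), (d, e, f)) such that
    x' = a*x + b*y + c and y' = d*x + e*y + f
    map the three src points onto the three dst points."""
    (x0, y0), (x1, y1), (x2, y2) = src
    det = x0 * (y1 - y2) - y0 * (x1 - x2) + (x1 * y2 - x2 * y1)
    if abs(det) < 1e-12:
        raise ValueError("source points are collinear")

    def row(v0, v1, v2):
        # Cramer's rule for one output coordinate.
        a = (v0 * (y1 - y2) - y0 * (v1 - v2) + (v1 * y2 - v2 * y1)) / det
        b = (x0 * (v1 - v2) - v0 * (x1 - x2) + (x1 * v2 - x2 * v1)) / det
        c = (x0 * (y1 * v2 - y2 * v1) - y0 * (x1 * v2 - x2 * v1)
             + v0 * (x1 * y2 - x2 * y1)) / det
        return a, b, c

    return (row(*(p[0] for p in dst)), row(*(p[1] for p in dst)))

def apply_affine(m, x, y):
    (a, b, c), (d, e, f) = m
    return a * x + b * y + c, d * x + e * y + f

def composite(background, overlay, alpha, dst_pts):
    """Blend overlay onto a copy of background.

    dst_pts gives where the overlay's top-left, top-right and
    bottom-left corners should land in background coordinates.
    """
    h, w = len(overlay), len(overlay[0])
    src_pts = [(0, 0), (w, 0), (0, h)]
    # Inverse map: background pixel -> overlay pixel, so no holes appear.
    inv = solve_affine(dst_pts, src_pts)
    out = [r[:] for r in background]
    for y in range(len(out)):
        for x in range(len(out[0])):
            u, v = apply_affine(inv, x + 0.5, y + 0.5)  # pixel centre
            iu, iv = math.floor(u), math.floor(v)       # nearest-neighbour sample
            if 0 <= iu < w and 0 <= iv < h:
                out[y][x] = (1 - alpha) * out[y][x] + alpha * overlay[iv][iu]
    return out

# Usage: place a 2x2 "awning" at half opacity onto a 4x4 background.
bg = [[0.0] * 4 for _ in range(4)]
awning = [[100.0, 100.0], [100.0, 100.0]]
result = composite(bg, awning, 0.5, [(1, 1), (3, 1), (1, 3)])
```

Iterating over background pixels and mapping each one back into the overlay is a design choice that keeps the result free of gaps when the overlay is stretched.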

vunderba [3 hidden]5 mins ago

Proprietary? Either gpt-image-2 or NB2.

I have an example of interior decorating inpainting where I replaced a large floor-to-ceiling window with a mirror, and the result was pretty impressive using NB Pro from nearly a year ago.

https://imgpb.com/ZXkiXV

Locally hostable? For my money I'd argue Flux.2 Klein but Qwen-Edit still puts in the work.

CharlesW [3 hidden]5 mins ago

NB2 means "Nano Banana 2", a Google image generation model. https://blog.google/innovation-and-ai/technology/ai/nano-ban...

IAmGraydon [3 hidden]5 mins ago

As far as I know, gpt-image-2 doesn't even let you define a mask unless you've already run it through one iteration, and once you do define the mask, it just ignores it 90% of the time. It's utterly useless for inpainting. Also, this and other proprietary models are severely limited in their output resolution.

I do agree, however, that the Flux2 family is the SoTA at the moment. Running locally via something like Comfy gets incredible results.

BoredPositron [3 hidden]5 mins ago

Flux Klein with a LoRA. GPT image and nano often produce high-frequency artifacts when editing.

zb3 [3 hidden]5 mins ago

1) What are RAM requirements?

2) If these are reasonable, a WebGPU demo would be great.
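
A rough lower bound on the RAM question can be sketched from the 0.2B parameter count in the title alone. This is back-of-the-envelope arithmetic, not a measured figure: it counts only the weights, assumes common precisions (fp32, fp16, int8), and ignores activations, any auxiliary encoders, and runtime overhead, all of which would add to the real footprint. The helper name `weight_memory_gb` is made up for illustration.

```python
def weight_memory_gb(num_params, bytes_per_param):
    """Memory needed just to hold the weights, in decimal gigabytes."""
    return num_params * bytes_per_param / 1e9

NUM_PARAMS = 0.2e9  # "0.2B" from the thread title

for name, nbytes in [("fp32", 4), ("fp16", 2), ("int8", 1)]:
    print(f"{name}: {weight_memory_gb(NUM_PARAMS, nbytes):.1f} GB")
```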