
The model learns all attributes of the images it's trained on, including that some carry a watermark. The fact that it generates a watermark in some images doesn't mean the output is a 1:1 copy from the training set; it just means that, to the model, some images seem to have a watermark, so it will sometimes add one. Often you can just add "no watermark" to the prompt (or supply it as a weighted negative prompt) and re-use the same seed to get the same image without the watermark.


It may or may not be a 1:1 image, but I think it's significant that in both cases, with different seeds, what sits directly behind and to the right of the watermark is a very similar building with different distortions applied to it. I'm not sure what the difference is between "learning" from a particular image and encoding that image with heavy compression, when in either case generation more or less reliably reconstructs the image algorithmically.

If I have a photographic memory and I memorize the Coca Cola logo and then draw it into a commercial work by decoding the firing of my neurons into muscle movements, the storage and retrieval method I used has no bearing on whether I infringed on their copyright.


No, it means that it is reproducing the original work rather than producing a new original work. It is basically a really fancy Instagram lens: still 100% derived from the underlying works, and therefore derivative rather than a newly created non-derivative work.


I'm not sure how you can make this argument just based on the model synthesizing a watermark that it has learned from the original dataset. Don't forget, the model is only 4GB in size, and while it's not out of the question that it could regurgitate an image from its training set, considering the size of that set, which is a few orders of magnitude larger, it is highly unlikely.
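A rough back-of-envelope calculation makes the point concrete. The numbers below are illustrative assumptions, not exact figures (roughly 4 GB of weights, and a LAION-scale training set assumed here to be about 2 billion images):

```python
# Back-of-envelope: how many bytes of weight capacity exist per training
# image, under assumed (not exact) numbers.
model_bytes = 4e9        # assumption: ~4 GB of model weights
training_images = 2e9    # assumption: ~2 billion training images

bytes_per_image = model_bytes / training_images
print(f"{bytes_per_image:.1f} bytes per training image")  # 2.0 bytes per training image

# Even a heavily compressed thumbnail needs thousands of bytes, so the
# model cannot memorize the dataset wholesale; only patterns repeated
# across many images (like a common stock-photo watermark) can be stored.
```

Memorization of individual images is still possible for frequently duplicated training examples, which is why watermarks and famous logos reappear even though full-dataset storage is mathematically ruled out.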



