I will delete all of the illegal pictures and tell the customer to fill the blanks in with whatever they want.
I could be proved wrong, but this sounds like an incomplete product. I'm not sure speech therapy resource books that are missing what sounds like a key component, which the buyer now has to provide, are going to market and sell well. These aren't coloring books. Statements like this also reinforce the lazy, corner-cutting, cheap nature that keeps cropping up in this thread with your work. Could sales be soft because the sellable materials come across as cheap, unprofessional and lazy to others?
The idea and concept you're doing and pushing forward in theory is fantastic, but for all the work done on the foundation of presentation, you've literally been cheaping out on the most pivotal parts of making it all work, the part that you can't create yourself. I think that may be hurting you far more than you realize, especially when it's a niche market in the first place.
But - will I need to replace each of my illegal sound label pics with the licensed ones? Is there some difference between the two that triggers the artist/company into thinking that I don't have a license?
In the end I'm likely to have obtained pictures from many different sources. What do I need to do to show that I obtained them legally?
I think you're still grossly misunderstanding the failure point and what's required to properly license the images used. If, through some miraculous coincidence, some of the images you've used can be licensed through, say, Getty Images, then you license those specific images through Getty on your account, and you can probably leave those pre-existing content assets unchanged, unless you want to standardize image quality in the work using their print-quality uncompressed image files and vector graphic masters (which will go a long way towards making a more polished and professional-looking product).
If you cannot get a proper license through Getty (or anyone else) for certain image assets, you simply CANNOT USE THEM, full stop. This means you need to source a different image asset with a proper license that is similar in nature to the one you cannot use.
Perhaps these examples will help you better understand:
You found a photo of the Great Pyramid in Giza from a unique angle online taken by Dee Snoots that you wanted to use. Dee Snoots holds exclusive copyright on this specific photo but has licensed it for royalty free usage through iStockPhoto/Getty Images. This means you can use Dee Snoots' photo by buying a license to use this image legally through iStockPhoto/Getty Images in your project. Problem solved, you get the exact photo you want for usage with your work!
You also found a photo of Big Ben tower online taken by Joe Schmidlap with a sunset sky in the background that you wanted to use. Joe Schmidlap holds exclusive copyright on this photo and has not licensed it for royalty free usage anywhere and refuses to give you permission to use it. This means you can't use Joe Schmidlap's photo of Big Ben tower in your project. However, iStockPhoto/Getty Images has a daytime photo of Big Ben tower with fluffy blue clouds behind it from a different angle also taken by Dee Snoots who has licensed their copyright through Getty for royalty free reproduction. So, you buy a license for that work through Getty to use Dee's photo instead of using Joe Schmidlap's in your work. Now, it's legal. It may not be the exact image you wanted, but it's close enough to substitute.
I would highly recommend that if you're going to use pre-existing generated content and properly license those assets, you try to source everything from the same licensing company. It makes for far easier record keeping, as you can literally point to one company for all license records on the artwork used.
The above is a legal theory put forward by some folks, including the New York Times (although they are arguing it with text).
It is not yet a matter of settled law and it's unclear how the courts are going to rule on whether or not AI models constitute a transformative use which would be covered under fair use.
Here's a good review of the current state of legal thinking (and open questions) in this area in the Columbia Journalism Review.
There's an interesting article on this over at Ars that just dropped yesterday.
Given the nature of how the machines work, they're just regurgitating parts of what was put in, in an aesthetically pleasing way. Everything they create is derivative of the data they're trained on. The matter is complicated, but it's doubtful the Google Books defense is gonna work on this one for text or image output. Although it's still not settled, it's pretty obvious the direction the law is going to fall. We have ridiculously strong copyright laws and frown heavily on derivative works based on copyrighted materials. These machines have no free will or spark of creativity; each one is literally a GIGO machine in that it can only spit out what it's trained on.
There are two camps in AI in regard to this... the AI True Believers™ who have drunk the Flavorade and deluded themselves into believing these fuzzy heuristics machines have the divine spark of thought and creativity and/or view humanity as stupid stochastic parrots incapable of creativity and easily replaced by machines, and are as such preparing for the singularity apocalypse with their massive wilderness bunkers wrought at their own hands with their technology while telling us all not to worry, because they're open and ethical and they got Red Teams on it to keep these infernal machines from killing us all! And it's important we keep our focus on the machines for creating this crony-capitalist hellscape, because it can never be the capital owners who created these machines who are at fault for making them in the first place... after all, it's all a black box, they don't even know how it works anymore! These are the idiots betting on being able to beat the copyright challenges in the courts right now, and their "we'll defend you from copyright infringement lawsuits" pinky swears with their generated content have no meat in the boilerplate or financial backing promised. Meanwhile, their lawyers are still secretly scrambling and making deals with large content repositories to license those works to avoid paying out when the dust settles on these lawsuits.
The other camp using AI generation for artworks is using properly licensed databases, calling a spade a spade, and recognizing that it's producing uncopyrightable content due to the derivative nature of the machine. These people are putting up the same copyright infringement protections with real dollar amounts in real contracts on their AI generated content that they're putting behind their properly licensed works.
This is why AI generated artwork cannot be copyrighted.
This is incorrect. The reason AI generated artwork cannot be copyrighted is that the US Copyright Office does not consider it "the product of human authorship", which is a key criterion for copyrights to be granted.
Yes, true, but this was not the perspective I was taking with it. I'm not talking about AI models' inability to register their own copyrights. I'm talking about the likely consequence of those lawsuits shaking out with the inability for AI users to take purely synthesized AI content generated whole-stock and unmodified from their prompts and copyright it for their own use, because it'll be legally viewed as derivative content free from the protection of fair use. Given it's a pretty reasonable conclusion that the courts and copyright holders are gonna laugh the AI companies out of the courtroom with their, "We should totally get a free pass on fair use because it's 'academic usage' to teach these machines that we're then flipping over to use to generate content for profit!" defense, there's not much reason to expect otherwise... especially when it's been proven time and again that those copyrighted works are clearly in existence, as a whole, in the datasets. Text. Visual. You name it.
All this, however, is irrelevant to OP's problem, outside of us both proving from all ends that his solution to the problem is not to be found in AI, but in paying artists for the work he needs to complete his own sellable works in a way that doesn't leave him destitute after a few days playing lawyerball in front of a dude in a black dress and twelve angry people.