ChatGPT turned Large Language Models into one of the most talked-about technologies around, but we are already seeing the rise of MLLMs, or Multimodal Large Language Models, which can process images as well as text. Apple has just released its own MLLM, dubbed MGIE, and it may represent the next step forward in the AI race.

What sets MGIE apart is its ability to edit images based on natural language instructions. Prompts don't have to be phrased in a way that only an AI would parse; they can be written in normal everyday language, much like the instructions one would give to a human image editor. Under the hood, MGIE uses its MLLM to translate that plain language into more technical editing instructions. For example, if a user were to give the instruction to make the sky in a par...