Apple's New AI Offers Image Editing With Natural Language Prompts

ChatGPT made Large Language Models one of the most cutting-edge technologies around, but we are already seeing the rise of MLLMs, or Multimodal Large Language Models, which can process images as well as text. Apple has just released its own MLLM, dubbed MGIE, and it might represent the next step forward in the AI race.


The main thing that sets MGIE apart is its ability to edit images based on natural language instructions. Prompts don’t have to be phrased in a way that only an AI would parse; they can be given in normal, everyday language, much like the instructions one would give a human image editor.

To do this, MGIE uses its MLLM to translate plain language into more technical editing instructions. For example, if a user asks for the sky in a particular picture to be a deeper shade of blue, MGIE will translate this into an instruction to increase the saturation of that region by 20% or so.
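As a rough illustration, and not code taken from MGIE itself, the hand-written equivalent of that kind of derived instruction might look something like this, with the file name and sky region as placeholders:

```python
from PIL import Image, ImageEnhance

# Hypothetical equivalent of the derived instruction: deepen the sky by
# raising saturation in that region by roughly 20%. The path and the
# coordinates are placeholders, not anything MGIE outputs verbatim.
img = Image.open("photo.jpg").convert("RGB")

sky_box = (0, 0, img.width, img.height // 3)   # assume the sky is the top third
sky = img.crop(sky_box)

# ImageEnhance.Color adjusts saturation; 1.0 leaves it unchanged, 1.2 adds ~20%.
sky = ImageEnhance.Color(sky).enhance(1.2)

img.paste(sky, sky_box)
img.save("photo_deeper_sky.jpg")
```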

On top of that, MGIE leverages a distinct end-to-end training scheme to create a latent representation of the result the user is looking for, referred to as visual imagination, and then derives the concrete instructions needed to edit each and every pixel accordingly. That level of precision could prove enormously useful, since it may allow edits to be made far faster than they otherwise would be.
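To make that two-stage idea a little more concrete, here is a minimal, hypothetical sketch of the flow; it is not MGIE’s actual code, and every function below is an invented stand-in: one step expands the prompt into a latent edit representation, and a second step applies a pixel-level change conditioned on it.

```python
import numpy as np

# Hypothetical stand-ins for the two stages, only to illustrate the flow:
# a real system would use an MLLM for the first step and a diffusion-based
# editor for the second.

def imagine_edit(prompt: str) -> np.ndarray:
    """Stand-in for the MLLM: map a terse prompt to a latent
    'visual imagination' vector describing the intended result."""
    seed = sum(ord(c) for c in prompt)          # toy, deterministic seed
    return np.random.default_rng(seed).standard_normal(512)

def apply_edit(image: np.ndarray, imagination: np.ndarray) -> np.ndarray:
    """Stand-in for the pixel-level editor: apply a change conditioned
    on the latent edit representation (here, just a brightness nudge)."""
    strength = float(np.tanh(imagination.mean()))
    return np.clip(image * (1.0 + 0.1 * strength), 0, 255).astype(np.uint8)

image = np.random.default_rng(0).integers(0, 256, (64, 64, 3), dtype=np.uint8)
edited = apply_edit(image, imagine_edit("make the sky a deeper blue"))
```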

MGIE can optimize photos, edit them, manipulate them, or handle just about any other adjustment a user might require. It is currently available as an open-source model on GitHub, allowing users around the world to take advantage of this AI breakthrough, which Apple made in collaboration with the University of California.

Photo: Digital Information World - AIgen
