Text-to-Image AI
You describe the image you want, and the model produces it from scratch. The model is Z-Image-Turbo, an open-source diffusion model with a focus on speed and prompt adherence.
- 1
Prompt parsing
Your text is tokenized and encoded into embeddings that condition the model. Words like "soft lighting", "35mm", or "watercolor" steer the output toward specific visual styles.
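To make "tokenized" concrete, here is a minimal sketch of word-level tokenization into integer ids. It is a toy, not the model's actual tokenizer: production systems typically use subword tokenizers and a learned text encoder, and the `tokenize` and `encode` helpers and the on-the-fly vocabulary are assumptions made for illustration.

```python
import re

def tokenize(prompt):
    """Lowercase the prompt and split it into word/number tokens (toy rule)."""
    return re.findall(r"[a-z0-9]+", prompt.lower())

def encode(tokens, vocab):
    """Map each token to an integer id, adding unseen tokens to the vocabulary."""
    ids = []
    for tok in tokens:
        if tok not in vocab:
            vocab[tok] = len(vocab)  # hypothetical vocabulary built on the fly
        ids.append(vocab[tok])
    return ids

vocab = {}
tokens = tokenize("A lighthouse at dusk, soft lighting, 35mm")
ids = encode(tokens, vocab)
print(tokens)
print(ids)
```

In a real pipeline, ids like these would be passed to a text encoder, and the resulting embeddings are what the image model is conditioned on.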
- 2
Latent diffusion
Starting from random noise, the model iteratively denoises a latent representation, guided by your prompt, with the latent's shape set by the chosen aspect ratio. More steps generally give cleaner details; fewer steps give faster output.
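The trade-off between step count and quality can be seen in a toy refinement loop. This is a sketch only, not the actual Z-Image-Turbo sampler: the `target` list stands in for "what the prompt describes", and the fixed `rate` replaces the neural network that would normally predict what to remove at each step. Both are assumptions made for illustration.

```python
import random

def toy_denoise(target, steps, seed=0, rate=0.5):
    """Toy stand-in for a sampler: start from noise, refine step by step."""
    rng = random.Random(seed)
    # Start from pure Gaussian noise, one value per latent element.
    latent = [rng.gauss(0.0, 1.0) for _ in target]
    for _ in range(steps):
        # A real model would use a prompt-conditioned network here.
        # The toy version closes a fixed fraction of the gap to the target.
        latent = [x + rate * (t - x) for x, t in zip(latent, target)]
    return latent

def mean_abs_error(latent, target):
    """Average distance between the current latent and the target."""
    return sum(abs(x - t) for x, t in zip(latent, target)) / len(target)

target = [0.2, -0.5, 0.9, 0.0]  # hypothetical "clean" latent
for steps in (2, 8, 32):
    err = mean_abs_error(toy_denoise(target, steps), target)
    print(steps, err)
```

Each extra step shrinks the remaining error, so more steps land closer to the target at the cost of more loop iterations, which mirrors the quality versus speed trade-off described above.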
- 3
Decode and post-process
The latent is decoded into a full-resolution image. A safety filter runs once, the result is uploaded, and the URL is returned to your browser.
Output: a 1024-pixel PNG in square, portrait, or landscape format, yours to keep.
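As a rough illustration of how an aspect ratio could translate into pixel dimensions, here is a small sketch. It assumes the 1024 pixels refer to the longer side and that the shorter side is rounded to a multiple of 16. The 3:4 and 4:3 ratios and the `output_size` helper are hypothetical, and the service's actual sizes may differ.

```python
def output_size(aspect_w, aspect_h, long_side=1024, multiple=16):
    """Return (width, height) with the longer side equal to long_side.

    The shorter side is scaled by the aspect ratio and rounded to the
    nearest multiple of `multiple` (an assumed constraint, not a confirmed one).
    """
    if aspect_w >= aspect_h:
        width = long_side
        height = round(long_side * aspect_h / aspect_w / multiple) * multiple
    else:
        height = long_side
        width = round(long_side * aspect_w / aspect_h / multiple) * multiple
    return width, height

print(output_size(1, 1))  # square
print(output_size(3, 4))  # portrait (assumed ratio)
print(output_size(4, 3))  # landscape (assumed ratio)
```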