yeah, there will always be some latency, but how much depends on your workflow and the model you’re using.
have you tried GPT-4 Turbo or Gemini 2.0 Flash? they’re among the fastest models available and should cut the delay noticeably, even if they can’t eliminate it.
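if you want to actually measure the latency instead of guessing, a rough sketch of timing time-to-first-token works for any streaming client. `fake_stream` below is just a stand-in for a real streaming API call (e.g. a provider SDK with `stream=True`), not any particular library’s API:

```python
import time

def time_to_first_token(stream):
    """Return (seconds until first chunk, full text) for any token stream."""
    start = time.perf_counter()
    first = None
    chunks = []
    for chunk in stream:
        if first is None:
            # latency users actually feel: delay before the first token
            first = time.perf_counter() - start
        chunks.append(chunk)
    return first, "".join(chunks)

# hypothetical stand-in for a real streaming model call
def fake_stream():
    time.sleep(0.05)  # simulated network + model delay
    for tok in ["hello", " ", "world"]:
        yield tok

ttft, text = time_to_first_token(fake_stream())
print(f"first token after {ttft * 1000:.0f} ms; output: {text!r}")
```

swapping `fake_stream()` for a real streaming call lets you compare models on your own workflow, which matters more than published benchmarks.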