Consistency LLM: converting LLMs to parallel decoders accelerates inference 3.5x (40 points) by zhisbug | 3 comments