xAI: Grok Build 0.1

Provided by OpenRouter

Grok Build 0.1 is xAI’s fast coding model trained specifically for agentic software engineering workflows. It supports text and image inputs with text output, and is optimized for interactive coding...

Specifications

Context Length

256,000 tokens

Input Price

$1.00/M

Output Price

$2.00/M

Vision Support

Yes

Capabilities

TextVision

About xAI: Grok Build 0.1

Strengths

•Multimodal understanding - can process text and images
•Large context window (256k tokens) for long conversations

Use Cases

•Image and document understanding
•Content creation and writing assistance
•General conversations and Q&A

Limitations

Performance may vary based on query complexity, context length, and task type. Consider using higher-tier models for production-critical applications.

Sample Prompts

Try these prompts to explore xAI: Grok Build 0.1's capabilities:

Analyze this image and describe what you see in detail

Extract the key information from this screenshot

Compare the two images and explain the differences

Tip: Customize these prompts to fit your specific needs and use cases.

Credits required

xAI: Grok Build 0.1 uses tiered credit pricing. Subscribe for a monthly credit allowance, connect your own provider API key (BYOK), or browse lower-cost models on the catalog.

Credit cost per message is shown in the model picker. Economy models typically cost 1 credit; frontier models cost more.

Related Models

Similar models you might be interested in

Qwen: Qwen3.7 Plus

Qwen3.7-Plus is a cost-effective model in Alibaba's Qwen3.7 series. It supports text and image input with text output, building on the series' text capabilities with a comprehensive upgrade to its...

MiniMax: MiniMax M3

MiniMax-M3 is a multimodal foundation model from MiniMax. It supports text, image, and video inputs with text output, a 1M-token context window, and is suited for long-horizon agentic work, coding,...

StepFun: Step 3.7 Flash

Step 3.7 Flash is StepFun's latest high-efficiency multimodal Mixture-of-Experts model. It pairs a 196B-parameter language backbone with a vision encoder for native image and video understanding, activating roughly 11B parameters...

Perceptron: Perceptron Mk1

Perceptron Mk1 (Mark One) is Perceptron's highest-quality vision-language model for video and embodied reasoning.** It accepts image and video inputs paired with natural language queries, and produces detailed visual understanding...