How to Format JSONL for Claude Fine-Tuning
Here is the fact almost nobody states clearly: Anthropic's own Claude API has no public, self-serve fine-tuning. Claude fine-tuning is a managed feature on Amazon Bedrock, and the documented JSONL format targets Claude 3 Haiku specifically. This guide covers the exact per-line shape Bedrock expects — the top-level system field, strict user/assistant alternation, the record and token quotas — plus the two competing format variants and how the whole thing differs from OpenAI. Validate every line against the JSONL validator before you submit a job.
The Fact Nobody States Clearly
Before any formatting detail: Anthropic's own Claude API has no public, self-serve fine-tuning. If you came here expecting an endpoint on api.anthropic.com that takes a JSONL file, there isn't one, at least not one that is generally available the way OpenAI's fine-tuning is.
The documented, generally-available way to fine-tune a Claude model is Amazon Bedrock's model-customization feature, and the published JSONL format targets Claude 3 Haiku specifically. Newer Claude models (Sonnet 4.x, Opus 4.x, Haiku 4.5, Fable 5) are not documented as fine-tunable. So when people search for "Claude fine-tuning format," what they almost always need is the Bedrock Claude 3 Haiku JSONL format.
This matters for two reasons. First, the format is dictated by AWS, not Anthropic, so the authoritative source is the AWS Bedrock model-customization documentation. Second, the model list and exact quotas can vary by region, so always confirm against the AWS page for the model and region you're targeting before you build a dataset.
What JSONL Is (and the Array Trap)
JSONL — JSON Lines — is one complete JSON object per line, separated by newlines. It is not a JSON array. This is the single most common formatting mistake.
Correct (each line is its own object):
{"system": "...", "messages": [...]}
{"system": "...", "messages": [...]}
{"system": "...", "messages": [...]}
Wrong (a single JSON array — Bedrock will reject this):
[
{"system": "...", "messages": [...]},
{"system": "...", "messages": [...]}
]
No enclosing brackets, no commas between lines, no trailing comma after the last line. Each line must parse as a standalone JSON object. If you're generating the file from a list in code, write one object per line, each followed by a newline; don't json.dumps the whole list.
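As a minimal sketch of that last point, the Python below serializes each record separately instead of dumping the whole list. The function names to_jsonl and write_jsonl and the sample records are illustrative, not part of any AWS tooling.

```python
import json

def to_jsonl(records):
    """Serialize a list of dicts as JSONL text: one JSON object per line."""
    # json.dumps with no indent produces a single line; newlines inside
    # string values are escaped, so they cannot split a record in two.
    return "".join(json.dumps(record) + "\n" for record in records)

def write_jsonl(path, records):
    # encoding="utf-8" (not "utf-8-sig") writes no BOM; newline="\n"
    # stops Windows from translating line endings to \r\n.
    with open(path, "w", encoding="utf-8", newline="\n") as f:
        f.write(to_jsonl(records))

# Placeholder training examples in the string form.
records = [
    {"system": "You are a helpful assistant.",
     "messages": [{"role": "user", "content": "what is AWS"},
                  {"role": "assistant", "content": "Amazon Web Services."}]},
    {"system": "You are a helpful assistant.",
     "messages": [{"role": "user", "content": "what about S3?"},
                  {"role": "assistant", "content": "Object storage."}]},
]

print(to_jsonl(records), end="")
```

Calling write_jsonl("train.jsonl", records) would write the same text to disk; the file name is only an example.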
The Claude / Bedrock Format (String Form)
This is the shape shown in AWS's data-prep documentation for Claude 3 Haiku. Note three things: system is a top-level string, each message's content is a string, and there is no schemaVersion field.
A single-turn example (one user question, one assistant answer):
{"system": "You are a helpful assistant.", "messages": [{"role": "user", "content": "what is AWS"}, {"role": "assistant", "content": "it's Amazon Web Services."}]}
A multi-turn example is the same shape with more alternating turns:
{"system": "You are a helpful assistant.", "messages": [{"role": "user", "content": "what is AWS"}, {"role": "assistant", "content": "it's Amazon Web Services."}, {"role": "user", "content": "what about S3?"}, {"role": "assistant", "content": "S3 is Amazon's object storage service."}]}
Formatted across lines for readability (the real file keeps each example on one line):
{
  "system": "You are a helpful assistant.",
  "messages": [
    { "role": "user", "content": "what is AWS" },
    { "role": "assistant", "content": "it's Amazon Web Services." }
  ]
}
The assistant's final message is the training target: the thing the model learns to produce. Everything before it is the context the model conditions on.
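To make the context/target split concrete, here is a small sketch that separates one string-form line into the part the model conditions on and the part it learns to produce. The helper name split_context_and_target is hypothetical, and treating a missing system field as an empty string is an assumption made for the example.

```python
import json

def split_context_and_target(line):
    """Split one string-form JSONL line into (context, target)."""
    record = json.loads(line)
    messages = record["messages"]
    context = {
        # Assumption: a missing system field is treated as empty here.
        "system": record.get("system", ""),
        # Every turn except the last is context.
        "messages": messages[:-1],
    }
    # The final assistant turn is the text the model is trained to produce.
    target = messages[-1]["content"]
    return context, target

line = json.dumps({
    "system": "You are a helpful assistant.",
    "messages": [
        {"role": "user", "content": "what is AWS"},
        {"role": "assistant", "content": "Amazon Web Services."},
    ],
})

context, target = split_context_and_target(line)
print("context turns:", len(context["messages"]))
print("target:", target)
```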
The system Field
In the Claude/Bedrock format, the system prompt lives in a top-level system field, sitting beside messages rather than inside it. In the string form it's a plain string:
{"system": "You are a terse SQL assistant. Return only the query.", "messages": [...]}
This is the structural opposite of OpenAI, where the system prompt is a message with "role": "system" at the front of the messages array. In Claude's format, there is no system role inside messages; putting one there is a validation error. The roles inside messages are only user and assistant.
If every training example shares the same system prompt, repeat it on every line. The system field is per-example, so it can also vary line to line if you're training behavior that depends on different system instructions.
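A short sketch of the per-example system field: the hypothetical helper make_record below builds one string-form record from a system prompt and a list of (user, assistant) pairs, so the prompt can stay fixed or change from line to line. The prompts and SQL are made-up sample data.

```python
import json

def make_record(system, pairs):
    """Build one string-form record from a system prompt and (user, assistant) pairs."""
    messages = []
    for user_text, assistant_text in pairs:
        messages.append({"role": "user", "content": user_text})
        messages.append({"role": "assistant", "content": assistant_text})
    # The system prompt sits beside messages, never inside it.
    return {"system": system, "messages": messages}

# Two examples with different system prompts (illustrative data only).
examples = [
    ("You are a terse SQL assistant. Return only the query.",
     [("Count the rows in orders.", "SELECT COUNT(*) FROM orders;")]),
    ("You are a patient SQL tutor. Explain the query briefly.",
     [("Count the rows in orders.",
       "SELECT COUNT(*) FROM orders; COUNT(*) counts every row.")]),
]

for system, pairs in examples:
    print(json.dumps(make_record(system, pairs)))
```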
Role and Alternation Rules
The messages array follows strict rules:
- Only two roles: user and assistant. The system prompt is the separate top-level field, never a message role.
- Must start with a user message. The first turn is always the human side.
- Roles must strictly alternate: user, assistant, user, assistant, … Two user turns in a row (or two assistant turns) are invalid.
- Must end with an assistant message. That final assistant turn is the training target.
- Minimum 2 messages per example: at least one user/assistant pair.
So the smallest valid example is a single user message followed by a single assistant message. Multi-turn examples just extend the pattern: user → assistant → user → assistant, always starting user and ending assistant.
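The rules above can be checked mechanically. The following is a minimal sketch, not Bedrock's own validator: check_messages is a hypothetical helper that returns a list of human-readable problems for one messages array, and an empty list means the array passes these particular checks.

```python
ALLOWED_ROLES = {"user", "assistant"}

def check_messages(messages):
    """Return a list of rule violations for one messages array."""
    errors = []
    if len(messages) < 2:
        errors.append("need at least 2 messages (one user/assistant pair)")
    for i, message in enumerate(messages):
        role = message.get("role")
        if role not in ALLOWED_ROLES:
            errors.append(f"message {i}: role {role!r} is not allowed")
        # Strict alternation: a turn may not repeat the previous role.
        if i > 0 and role == messages[i - 1].get("role"):
            errors.append(f"message {i}: role {role!r} repeats the previous turn")
    if messages and messages[0].get("role") != "user":
        errors.append("first message must have role 'user'")
    if messages and messages[-1].get("role") != "assistant":
        errors.append("last message must have role 'assistant'")
    return errors

good = [
    {"role": "user", "content": "what is AWS"},
    {"role": "assistant", "content": "Amazon Web Services."},
]
bad = [
    {"role": "user", "content": "what is AWS"},
    {"role": "user", "content": "hello?"},
]

print(check_messages(good))
print(check_messages(bad))
```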
The Two Format Variants
You'll encounter two shapes in AWS material, and it's worth being honest about the ambiguity.
String form (doc-canonical for Claude 3 Haiku). Shown on AWS's data-prep page. system is a string, content is a string, no schemaVersion:
{"system": "You are a helpful assistant.", "messages": [{"role": "user", "content": "what is AWS"}, {"role": "assistant", "content": "it's Amazon Web Services."}]}
Converse / array form. Shown in some AWS best-practices blog content. system is an array of {"text": ...} objects, each message's content is an array of text parts, and a schemaVersion field is present:
{
  "schemaVersion": "bedrock-conversation-2024",
  "system": [{ "text": "You are a helpful assistant." }],
  "messages": [
    { "role": "user", "content": [{ "text": "what is AWS" }] },
    { "role": "assistant", "content": [{ "text": "it's Amazon Web Services." }] }
  ]
}
How to tell which to use. The string form is the doc-canonical Claude 3 Haiku shape on the AWS data-prep page; the schemaVersion: "bedrock-conversation-2024" field belongs to the Converse format. They are not interchangeable, and you must not mix them within a file (some lines string-content, some lines array-content). The honest answer is: follow the exact format shown on the AWS Bedrock page for your specific model and region, then validate every line before you submit. If the page for your model shows the string form, use the string form throughout; if it shows the Converse form with schemaVersion, use that throughout.
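One way to catch accidental mixing is to classify every record by shape and confirm the whole file agrees. This is a rough sketch under the assumption that the two shapes are exactly as described above; detect_variant and its labels ("string", "converse", "mixed", "unknown") are names invented for the example.

```python
import json

def detect_variant(record):
    """Classify one record as 'string', 'converse', 'mixed', or 'unknown'."""
    shapes = set()
    if "schemaVersion" in record:
        shapes.add("converse")
    if "system" in record:
        shapes.add("string" if isinstance(record["system"], str) else "converse")
    for message in record.get("messages", []):
        is_string = isinstance(message.get("content"), str)
        shapes.add("string" if is_string else "converse")
    if not shapes:
        return "unknown"
    # More than one shape inside a single record means the forms are mixed.
    return shapes.pop() if len(shapes) == 1 else "mixed"

def file_variants(lines):
    """Return the set of variants seen across all non-empty lines."""
    return {detect_variant(json.loads(line)) for line in lines if line.strip()}

string_line = json.dumps({
    "system": "You are a helpful assistant.",
    "messages": [{"role": "user", "content": "hi"},
                 {"role": "assistant", "content": "hello"}],
})
converse_line = json.dumps({
    "schemaVersion": "bedrock-conversation-2024",
    "system": [{"text": "You are a helpful assistant."}],
    "messages": [{"role": "user", "content": [{"text": "hi"}]},
                 {"role": "assistant", "content": [{"text": "hello"}]}],
})

variants = file_variants([string_line, converse_line])
print("consistent" if len(variants) == 1 and "mixed" not in variants else "inconsistent")
```

A file is consistent here only when file_variants returns a single label that is not "mixed"; which label is correct still depends on the AWS page for your model.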
Quotas and Limits (Claude 3 Haiku on Bedrock)
From the AWS Bedrock model-customization documentation for Claude 3 Haiku:
- Minimum 32 training records.
- Maximum 10,000 training records.
- Maximum 1,000 validation records.
- Maximum 32,000 tokens per record.
- Maximum training dataset size 10 GB; maximum validation dataset size 1 GB.
For planning, AWS suggests estimating roughly 6 characters per token, so a 32,000-token cap works out to roughly 192,000 characters per record, though real tokenization varies. To check a specific record against the per-record cap, count its tokens rather than relying on the character estimate. These numbers can change and can differ by region, so treat them as a snapshot and confirm against the current AWS page before building your dataset.
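For a quick planning pass, the sketch below applies the figures quoted above to a list of string-form records. It is an approximation only: the constants are the snapshot values from this section, counting just the system and content text is an assumption about what contributes to the cap, and estimate_tokens and check_quotas are hypothetical helper names.

```python
MIN_TRAIN_RECORDS = 32
MAX_TRAIN_RECORDS = 10_000
MAX_TOKENS_PER_RECORD = 32_000
CHARS_PER_TOKEN = 6  # planning estimate, not a real tokenizer

def estimate_tokens(record):
    """Rough token estimate for one string-form record (characters / 6, rounded up)."""
    # Assumption: only the system and content text are counted.
    chars = len(record.get("system", ""))
    chars += sum(len(message["content"]) for message in record["messages"])
    return -(-chars // CHARS_PER_TOKEN)  # ceiling division

def check_quotas(records):
    """Return a list of quota problems for a training set."""
    problems = []
    count = len(records)
    if count < MIN_TRAIN_RECORDS:
        problems.append(f"{count} records is below the minimum of {MIN_TRAIN_RECORDS}")
    if count > MAX_TRAIN_RECORDS:
        problems.append(f"{count} records is above the maximum of {MAX_TRAIN_RECORDS}")
    for index, record in enumerate(records):
        estimate = estimate_tokens(record)
        if estimate > MAX_TOKENS_PER_RECORD:
            problems.append(f"record {index}: about {estimate} tokens, over the cap")
    return problems

sample = {"system": "You are a helpful assistant.",
          "messages": [{"role": "user", "content": "what is AWS"},
                       {"role": "assistant", "content": "Amazon Web Services."}]}
print(estimate_tokens(sample))
print(check_quotas([sample] * 5))
```

A record that sits close to the cap by this estimate is worth counting with a real tokenizer before you rely on it.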
How It Differs from OpenAI
If you've formatted fine-tuning data for OpenAI, the differences are concentrated and easy to trip over:
- System prompt location. Claude/Bedrock puts it in a top-level system field. OpenAI uses a message with "role": "system" inside the messages array.
- Allowed roles. Claude messages are user/assistant only. OpenAI mixes system/user/assistant all inside messages.
- Strict alternation. Claude must start with user and end with assistant, strictly alternating. OpenAI is more lenient about ordering.
- No per-message weight. Claude's format has no per-message weight field; OpenAI supports weight to include or exclude an assistant message from the loss.
- Higher minimum. Claude 3 Haiku on Bedrock requires a minimum of 32 examples vs OpenAI's 10.
See the companion guide on formatting JSONL for OpenAI fine-tuning for the OpenAI side in full.
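If you already have OpenAI-style chat examples, a conversion along these lines may be a reasonable starting point. It is a sketch with assumptions stated up front: content values are plain strings, multiple system messages are joined with a blank line (a choice made for this example, not an AWS rule), extra keys such as weight are dropped, and any other role raises an error. The function name openai_to_bedrock is hypothetical.

```python
import json

def openai_to_bedrock(example):
    """Convert one OpenAI chat-format example to the Bedrock Claude string form."""
    system_parts = []
    messages = []
    for message in example["messages"]:
        role = message["role"]
        if role == "system":
            # Lift system messages out of the array into the top-level field.
            system_parts.append(message["content"])
        elif role in ("user", "assistant"):
            # Keep only role and content; keys such as weight are dropped.
            messages.append({"role": role, "content": message["content"]})
        else:
            raise ValueError(f"unsupported role: {role!r}")
    record = {}
    if system_parts:
        # Assumption: several system messages are joined with a blank line.
        record["system"] = "\n\n".join(system_parts)
    record["messages"] = messages
    return record

openai_example = {"messages": [
    {"role": "system", "content": "You are a terse SQL assistant."},
    {"role": "user", "content": "Count the rows in orders."},
    {"role": "assistant", "content": "SELECT COUNT(*) FROM orders;", "weight": 1},
]}

print(json.dumps(openai_to_bedrock(openai_example)))
```

The conversion does not repair ordering, so an OpenAI example with two user turns in a row still needs fixing before it satisfies the alternation rules.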
Validation Pitfalls Checklist
Run through these before you upload. Most failed Bedrock jobs trace back to one of them:
- JSON array instead of JSONL. One object per line, no enclosing [ ], no commas between lines.
- Trailing commas. Illegal in JSON. (AWS's own multi-turn doc example has shipped with an illegal trailing comma; don't copy that part.)
- Single quotes. JSON requires double quotes for keys and string values.
- BOM or blank lines. A UTF-8 byte-order mark at the file start, or empty lines between records, will break parsing.
- System prompt in a message. The system prompt is the top-level field, not a messages entry.
- Not starting with user / not ending with assistant. First turn must be user; last turn must be assistant (the target).
- Non-alternating roles. No two same-role turns back to back.
- Mixing string-content and array-content forms. Pick one variant for the whole file; don't interleave.
- Over the token cap. Each record must stay under 32,000 tokens.
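The syntax-level items on this checklist (array instead of JSONL, trailing commas, single quotes, BOM, blank lines) can be caught with the standard library alone. The sketch below works on raw file bytes and returns (line number, message) pairs; lint_jsonl_bytes is a hypothetical name, and treating a single trailing newline at the end of the file as acceptable is an assumption.

```python
import json

def lint_jsonl_bytes(data):
    """Return (line_number, message) pairs for syntax-level problems in a JSONL file."""
    issues = []
    if data.startswith(b"\xef\xbb\xbf"):
        issues.append((1, "file starts with a UTF-8 BOM"))
        data = data[3:]
    lines = data.decode("utf-8").split("\n")
    if lines and lines[-1] == "":
        lines.pop()  # assumption: one trailing newline at end of file is fine
    for number, line in enumerate(lines, start=1):
        if not line.strip():
            issues.append((number, "blank line"))
            continue
        try:
            parsed = json.loads(line)
        except json.JSONDecodeError as error:
            # Covers trailing commas, single quotes, and the pieces of a
            # JSON array that was spread across several lines.
            issues.append((number, f"invalid JSON: {error.msg}"))
            continue
        if not isinstance(parsed, dict):
            issues.append((number, "line is not a JSON object"))
    return issues

sample = b'{"system": "s", "messages": []}\n\n[{"system": "s"}]\n'
for number, message in lint_jsonl_bytes(sample):
    print(f"line {number}: {message}")
```

In practice you would pass the result of open(path, "rb").read(); the role, alternation, and token checks need separate logic on top of this.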
Try It Live
The JSONL validator checks each line of your file independently — it catches the array-instead-of-JSONL trap, trailing commas, single quotes, BOMs, and blank lines before Bedrock ever sees them. Validation runs in your browser, so you can check sensitive training data without uploading it anywhere. Pair it with the token counter to confirm each record stays under the 32,000-token per-record cap before you submit a fine-tuning job.
Frequently Asked Questions
Can I fine-tune Claude through the Anthropic API directly?
Not through a public, self-serve endpoint. As of this writing, Anthropic's own Claude API does not offer self-serve fine-tuning the way OpenAI does. The documented, generally-available path to fine-tuning a Claude model is Amazon Bedrock's model-customization feature, and the published JSONL format targets Claude 3 Haiku. Newer models (Sonnet 4.x, Opus 4.x, Haiku 4.5, Fable 5) are not documented as fine-tunable. If you need to customize Claude, plan around Bedrock and Claude 3 Haiku, and check the AWS Bedrock model-customization documentation for the current list of supported models in your region.
Where does the system prompt go in Claude fine-tuning JSONL?
In a top-level system field on each line — not inside the messages array. This is the single biggest structural difference from OpenAI, where the system prompt is a message with "role": "system". In the Bedrock Claude 3 Haiku format the line looks like {"system": "You are a helpful assistant.", "messages": [...]}, and the messages array contains only user and assistant turns. Putting a system-role message inside messages is a validation error.
How many training examples does Bedrock require to fine-tune Claude 3 Haiku?
AWS Bedrock requires a minimum of 32 training records and allows a maximum of 10,000 training records, plus up to 1,000 validation records. Each record is capped at 32,000 tokens (AWS suggests estimating roughly 6 characters per token for planning). The full training dataset can be up to 10 GB and the validation dataset up to 1 GB. Note the 32-example floor is higher than OpenAI's 10 — confirm the exact current limits on the AWS Bedrock model-customization page for your region before you build the dataset.
Why are there two different Claude JSONL formats I see online?
Because two shapes circulate in AWS material. The data-prep documentation for Claude 3 Haiku shows the simple string form: system is a plain string and each message's content is a plain string, with no schemaVersion field. Some AWS best-practices blog content shows the Converse-style form instead: system is an array of {"text": ...} objects, content is an array of text parts, and a "schemaVersion": "bedrock-conversation-2024" field is present. The string form is the doc-canonical Haiku shape; the schemaVersion field belongs to the Converse format. Don't mix them — follow the exact format shown on the AWS page for your specific model and region, and validate before submitting.
What are the most common mistakes in Claude fine-tuning JSONL?
The frequent ones: submitting a single JSON array instead of one JSON object per line (JSONL is line-delimited, not an array); trailing commas (illegal in JSON — AWS's own multi-turn doc example has even shipped with one); single quotes instead of double quotes; a UTF-8 BOM or blank lines in the file; putting the system prompt inside the messages array; not starting with a user message or not ending with an assistant message; non-alternating roles (two user turns in a row); and inconsistently mixing the string-content and array-content forms. Run each line through a JSONL validator first to catch the syntax-level errors before Bedrock rejects the whole job.