Validating JSON with JSON Schema

Learn how JSON Schema works, how to generate a schema from sample data, and how to validate documents with clear, path-based error messages.

JSONPost · 6 min read
JSON Schema is a vocabulary for describing the shape of JSON data: which fields are required, what types they hold, and what values are allowed. It turns an informal "this is roughly what our API returns" into a precise, machine-checkable contract that both humans and code can rely on. Instead of discovering at runtime that a price came back as a string or a required id was missing, you catch the problem the moment the data crosses a boundary.

In this guide we'll look at what a schema is made of, how to generate one from real data, how to validate documents in code, and how the same schema can power your tests through mock data.

The anatomy of a schema

A JSON Schema is itself a JSON document. The most important keywords are type, properties, and required. Here's a small schema describing a user object:

{
  "$schema": "http://json-schema.org/draft-07/schema#",
  "type": "object",
  "properties": {
    "id": { "type": "integer" },
    "name": { "type": "string" },
    "email": { "type": "string", "format": "email" },
    "age": { "type": "integer", "minimum": 0 }
  },
  "required": ["id", "name", "email"],
  "additionalProperties": false
}

Reading it top to bottom: the value must be an object; it has four known properties with their own types; id, name, and email are required; and additionalProperties: false means any unexpected key is a validation error. That last line is the difference between "looks roughly right" and "matches the contract exactly."
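
To make the last two keywords concrete, here is a minimal hand-rolled sketch of what required and additionalProperties: false check. The checkKeys helper is hypothetical and looks only at key names; a real validator such as Ajv also checks types, formats, and ranges.

```javascript
// Illustration only: the user schema from above as a JavaScript object.
const userSchema = {
  type: "object",
  properties: {
    id: { type: "integer" },
    name: { type: "string" },
    email: { type: "string", format: "email" },
    age: { type: "integer", minimum: 0 },
  },
  required: ["id", "name", "email"],
  additionalProperties: false,
};

const hasOwn = (obj, key) => Object.prototype.hasOwnProperty.call(obj, key);

// Hypothetical helper: reports missing required keys and unexpected keys only.
function checkKeys(schema, value) {
  const problems = [];
  for (const key of schema.required ?? []) {
    if (!hasOwn(value, key)) problems.push(`missing required property "${key}"`);
  }
  if (schema.additionalProperties === false) {
    const known = schema.properties ?? {};
    for (const key of Object.keys(value)) {
      if (!hasOwn(known, key)) problems.push(`unexpected property "${key}"`);
    }
  }
  return problems;
}

// Reports the missing "email" and the unexpected "nickname".
console.log(checkKeys(userSchema, { id: 1, name: "Ada", nickname: "ada" }));
```

Note that age is absent from the sample object and causes no complaint, because it is declared under properties but not listed in required.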

Generating a schema from data

Writing schemas by hand gets tedious for large payloads, so it's often faster to start from a real example. Paste a representative response into the JSON Schema Generator and it infers types, required fields, and nested structures for draft-07 or 2020-12. Given this sample:

{
  "id": 42,
  "name": "Ada Lovelace",
  "roles": ["admin", "editor"],
  "address": { "city": "London", "zip": "EC1A" }
}

the generator produces a schema with an object for address and an array of strings for roles. Treat the result as a strong first draft: tighten the required list, add format hints, and remove any fields that were only present by coincidence in your sample.
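
As a rough sketch of how inference from a sample can work, the hypothetical inferSchema function below walks a parsed JSON value and emits a type for each node. It is a naive illustration under stated assumptions (arrays are described by their first element, every key seen is marked required), not the algorithm used by any particular generator.

```javascript
// Hypothetical, deliberately naive schema inference for a parsed JSON value.
function inferSchema(value) {
  if (value === null) return { type: "null" };
  if (Array.isArray(value)) {
    // Assumption: the first element is representative of all items.
    return value.length > 0
      ? { type: "array", items: inferSchema(value[0]) }
      : { type: "array" };
  }
  if (typeof value === "object") {
    const properties = {};
    for (const [key, child] of Object.entries(value)) {
      properties[key] = inferSchema(child);
    }
    // Every key in the sample is marked required, which is why the
    // required list deserves a manual review afterwards.
    return { type: "object", properties, required: Object.keys(value) };
  }
  if (typeof value === "number") {
    return { type: Number.isInteger(value) ? "integer" : "number" };
  }
  return { type: typeof value }; // "string" or "boolean" for JSON input
}

const sample = {
  id: 42,
  name: "Ada Lovelace",
  roles: ["admin", "editor"],
  address: { city: "London", zip: "EC1A" },
};

console.log(JSON.stringify(inferSchema(sample), null, 2));
```

The blanket required list is exactly the kind of coincidence a single sample bakes in, so it is the first thing to tighten by hand.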

Constraining values

Types are only the beginning. JSON Schema can express precise rules about the values themselves. Strings support minLength, maxLength, pattern (a regular expression), and format for well-known shapes like email, uri, uuid, and date-time. Numbers support minimum, maximum, and multipleOf. Here is a schema for a product:

{
  "type": "object",
  "properties": {
    "sku": { "type": "string", "pattern": "^[A-Z]{3}-[0-9]{4}$" },
    "price": { "type": "number", "minimum": 0 },
    "currency": { "type": "string", "enum": ["USD", "EUR", "GBP"] },
    "tags": {
      "type": "array",
      "items": { "type": "string" },
      "minItems": 1,
      "uniqueItems": true
    }
  },
  "required": ["sku", "price", "currency"]
}

The enum keyword restricts currency to a fixed set of values, and the array constraints require at least one tag with no duplicates. These rules document your intent and reject bad data before it reaches your business logic.
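
The checks behind pattern, enum, and uniqueItems are easy to reproduce by hand, which helps when debugging a rule. The snippet below is a small sketch, not validator code; the helper name hasUniqueItems is made up for illustration.

```javascript
// JSON Schema patterns are not implicitly anchored, so the ^ and $ in the
// sku pattern are what force the whole string to match.
const skuPattern = new RegExp("^[A-Z]{3}-[0-9]{4}$");
const currencies = ["USD", "EUR", "GBP"];

// Hypothetical helper. A Set is enough for arrays of primitives such as
// strings; JSON Schema itself compares items by deep equality.
const hasUniqueItems = (items) => new Set(items).size === items.length;

console.log(skuPattern.test("ABC-1234"));             // true
console.log(skuPattern.test("abc"));                  // false
console.log(currencies.includes("JPY"));              // false
console.log(hasUniqueItems(["sale", "new", "sale"])); // false
```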

Validating documents in code

Once you have a schema, you can validate any document against it. In JavaScript, the most popular validator is Ajv. The pattern is: compile the schema once, then run data through the compiled function.

import Ajv from "ajv";
import addFormats from "ajv-formats";

const ajv = new Ajv({ allErrors: true });
addFormats(ajv);

// `schema` is the product schema from the previous section.
const validate = ajv.compile(schema);

const data = { sku: "abc", price: -5, currency: "JPY" };

if (!validate(data)) {
  for (const err of validate.errors) {
    console.log(`${err.instancePath || "(root)"} ${err.message}`);
  }
}

The allErrors option collects every problem instead of stopping at the first, and ajv-formats adds support for format keywords like email and date-time. For the invalid product above, the output is precise and path-based:

/sku must match pattern "^[A-Z]{3}-[0-9]{4}$"
/price must be >= 0
/currency must be equal to one of the allowed values

That is far more useful than a generic "invalid input" message: it tells you exactly which field failed and why. If you just want to check a document quickly without writing any code, paste the schema and the document into the JSON Schema Validator and you'll see the same path-based errors in your browser.
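
At an application boundary it is often convenient to turn a failed validation into a thrown error. The sketch below assumes only Ajv's convention that a compiled validator returns a boolean and exposes details on an errors array; assertValid, ValidationError, and stubValidate are hypothetical names, and the stub stands in for a compiled validator so the example runs without Ajv installed.

```javascript
// Hypothetical error type carrying the path-based messages.
class ValidationError extends Error {
  constructor(messages) {
    super(`Invalid data:\n${messages.join("\n")}`);
    this.name = "ValidationError";
    this.messages = messages;
  }
}

// Hypothetical helper: returns the data when valid, throws otherwise.
function assertValid(validate, data) {
  if (validate(data)) return data;
  const messages = (validate.errors ?? []).map(
    (err) => `${err.instancePath || "(root)"} ${err.message}`
  );
  throw new ValidationError(messages);
}

// Stand-in validator that mimics the Ajv calling convention. In real code,
// pass the function returned by ajv.compile(schema) instead.
function stubValidate(data) {
  const ok = typeof data.price === "number" && data.price >= 0;
  stubValidate.errors = ok
    ? null
    : [{ instancePath: "/price", message: "must be >= 0" }];
  return ok;
}

try {
  assertValid(stubValidate, { price: -5 });
} catch (err) {
  console.log(err.message);
}
```

Throwing keeps the calling code free of error-array plumbing, while the messages property preserves the individual paths for logging or an API response.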

Composition and reuse

Real data is rarely a flat object. JSON Schema offers composition keywords — allOf, anyOf, oneOf, and not — to combine smaller schemas. A common use is a value that can take one of several shapes. For example, a payment that is either a card or a bank transfer:

{
  "type": "object",
  "oneOf": [
    {
      "properties": {
        "type": { "const": "card" },
        "last4": { "type": "string" }
      },
      "required": ["type", "last4"]
    },
    {
      "properties": {
        "type": { "const": "bank" },
        "iban": { "type": "string" }
      },
      "required": ["type", "iban"]
    }
  ]
}
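
The rule behind oneOf is that exactly one branch must match, which the following sketch spells out. The predicates isCard and isBank are hand-written stand-ins for the two subschemas above, and matchesExactlyOne is a hypothetical helper, not part of any library.

```javascript
// Hand-written stand-ins for the two branches of the payment schema.
const isCard = (p) => p.type === "card" && typeof p.last4 === "string";
const isBank = (p) => p.type === "bank" && typeof p.iban === "string";

// oneOf passes only when exactly one branch matches; anyOf would accept
// one or more, and allOf would require every branch.
function matchesExactlyOne(value, branches) {
  return branches.filter((matches) => matches(value)).length === 1;
}

const branches = [isCard, isBank];
console.log(matchesExactlyOne({ type: "card", last4: "4242" }, branches)); // true
console.log(matchesExactlyOne({ type: "bank" }, branches));                // false
console.log(matchesExactlyOne({ type: "card", iban: "X" }, branches));     // false
```

The const on type acts as a discriminator: because a value cannot be both "card" and "bank", the branches are mutually exclusive and a payment can never match more than one of them.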

To avoid repeating yourself, factor common definitions into a $defs section and reference them with $ref:

{
  "type": "object",
  "properties": {
    "billing": { "$ref": "#/$defs/address" },
    "shipping": { "$ref": "#/$defs/address" }
  },
  "$defs": {
    "address": {
      "type": "object",
      "properties": {
        "city": { "type": "string" },
        "zip": { "type": "string" }
      },
      "required": ["city"]
    }
  }
}

Now the address shape is defined once and reused for both billing and shipping.
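
A reference such as #/$defs/address is a JSON Pointer into the same document. The sketch below shows how such a pointer can be followed; resolveLocalRef is a hypothetical helper that handles only same-document references, whereas real validators also resolve references across documents.

```javascript
// Hypothetical resolver for same-document references like "#/$defs/address".
function resolveLocalRef(rootSchema, ref) {
  if (!ref.startsWith("#")) {
    throw new Error(`Only same-document references are handled: ${ref}`);
  }
  const pointer = ref.slice(1);
  if (pointer === "") return rootSchema; // "#" refers to the whole document

  return pointer
    .split("/")
    .slice(1) // drop the empty token before the leading "/"
    // Undo URI percent-encoding, then JSON Pointer escapes (~1 is "/", ~0 is "~").
    .map((token) => decodeURIComponent(token).replace(/~1/g, "/").replace(/~0/g, "~"))
    .reduce((node, token) => {
      if (node === null || typeof node !== "object" || !(token in node)) {
        throw new Error(`Cannot resolve "${token}" in ${ref}`);
      }
      return node[token];
    }, rootSchema);
}

const orderSchema = {
  type: "object",
  properties: {
    billing: { $ref: "#/$defs/address" },
    shipping: { $ref: "#/$defs/address" },
  },
  $defs: {
    address: {
      type: "object",
      properties: { city: { type: "string" }, zip: { type: "string" } },
      required: ["city"],
    },
  },
};

const target = resolveLocalRef(orderSchema, orderSchema.properties.billing.$ref);
console.log(target === orderSchema.$defs.address); // true
```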

A note on drafts

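JSON Schema has evolved through several drafts. The two you'll meet most often are draft-07 and 2020-12. They're largely compatible, but a few keywords moved: definitions became $defs (in draft 2019-09), and tuple validation moved from the array form of items to prefixItems (in 2020-12). Pick one draft for a project and declare it with the $schema keyword so validators behave predictably. With Ajv, the default export targets draft-07, and 2020-12 is supported through the separate Ajv2020 class exported from ajv/dist/2020.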
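
To illustrate the tuple change, the sketch below rewrites a draft-07 tuple into its 2020-12 form. The helper tupleToPrefixItems is hypothetical and handles a single schema node only: it does not recurse into subschemas, rename definitions, or update $schema and $ref values, all of which a real migration would need.

```javascript
// Hypothetical single-node conversion of draft-07 tuple keywords.
function tupleToPrefixItems(schema) {
  const { items, additionalItems, ...rest } = schema;
  if (!Array.isArray(items)) return schema; // not a draft-07 tuple, leave as-is

  // The array form of items becomes prefixItems...
  const converted = { ...rest, prefixItems: items };
  // ...and additionalItems becomes items, which in 2020-12 applies to the
  // elements after the prefix.
  if (additionalItems !== undefined) converted.items = additionalItems;
  return converted;
}

const draft07Tuple = {
  type: "array",
  items: [{ type: "number" }, { type: "string" }],
  additionalItems: false,
};

console.log(JSON.stringify(tupleToPrefixItems(draft07Tuple)));
// → {"type":"array","prefixItems":[{"type":"number"},{"type":"string"}],"items":false}
```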

Generating mock data

A schema is more than a gatekeeper — it's also a blueprint for fake data. The Schema to Mock Data tool reads your schema and produces realistic sample objects that conform to it: emails that look like emails, dates that look like dates, and values within your declared ranges. This is invaluable for seeding tests, populating a staging environment, or building a frontend before the backend exists.
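
The idea can be sketched in a few lines. The hypothetical mockFromSchema function below returns deterministic placeholder values for a small subset of keywords (type, const, enum, format, minimum, minLength, minItems); it ignores pattern, maximum, and uniqueItems, and it is not how any particular tool is implemented.

```javascript
// Fixed sample values for a few well-known formats (illustrative choices).
const SAMPLE_BY_FORMAT = {
  email: "user@example.com",
  uri: "https://example.com/",
  uuid: "00000000-0000-4000-8000-000000000000",
  "date-time": "2024-01-01T00:00:00Z",
};

// Hypothetical, deterministic mock generator for a subset of JSON Schema.
function mockFromSchema(schema) {
  if (schema.const !== undefined) return schema.const;
  if (Array.isArray(schema.enum)) return schema.enum[0];
  switch (schema.type) {
    case "object": {
      const out = {};
      for (const [key, child] of Object.entries(schema.properties ?? {})) {
        out[key] = mockFromSchema(child);
      }
      return out;
    }
    case "array": {
      const count = schema.minItems ?? 1;
      return Array.from({ length: count }, () => mockFromSchema(schema.items ?? {}));
    }
    case "string":
      return SAMPLE_BY_FORMAT[schema.format] ?? "a".repeat(Math.max(schema.minLength ?? 1, 1));
    case "integer":
      return Math.ceil(schema.minimum ?? 0); // only minimum is honored
    case "number":
      return schema.minimum ?? 0;
    case "boolean":
      return true;
    default:
      return null;
  }
}

const mockUser = mockFromSchema({
  type: "object",
  properties: {
    id: { type: "integer" },
    name: { type: "string" },
    email: { type: "string", format: "email" },
    age: { type: "integer", minimum: 0 },
  },
});

console.log(JSON.stringify(mockUser));
// → {"id":0,"name":"a","email":"user@example.com","age":0}
```

Deterministic output suits snapshot tests; a production tool would typically randomize values to produce more realistic data.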

Putting it together

A productive workflow looks like this:

  1. Capture a real response and run it through the JSON Schema Generator.
  2. Refine the draft — tighten required, add format and range constraints, and set additionalProperties to false where you want strictness.
  3. Validate incoming data against the schema at your application boundaries and in CI, using a library like Ajv.
  4. Generate mock data from the same schema for unit and integration tests.

Because one schema drives validation, documentation, and test data, it stays the single source of truth for your data contract. Ready to start? Generate a schema from your own JSON with the JSON Schema Generator, then check a document against it in the JSON Schema Validator — both run entirely in your browser.