Mock Data Generation: Building Realistic Test Fixtures for Your Apps

Why Mock Data Quality Matters

Bad test data is a hidden source of production bugs. Using unrealistic data ("foo", "bar", "test@test.com") in your fixtures means:

Edge cases involving special characters, long strings, or unusual values go untested
UI components look wrong in demos and reviews
Data truncation and overflow bugs slip through
Internationalization issues with names, addresses, and dates are never caught

Realistic mock data catches real bugs. A name like "Nguyễn Thị Trang" tests UTF-8 handling. An address like "St. John's Church Road" tests apostrophe escaping. A phone number like "+1 (555) 867-5309" tests format handling.
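
To make the UTF-8 point concrete, compare the length of the Vietnamese name above in characters with its length in bytes. This is a minimal sketch: the 16-byte column limit is a made-up figure for illustration, not taken from any real schema.

```javascript
// Hypothetical limit: a database column that stores at most 16 bytes.
const COLUMN_BYTE_LIMIT = 16;

// Normalize to NFC so each accented letter is a single code point.
const name = "Nguyễn Thị Trang".normalize("NFC");

const charCount = Array.from(name).length;               // counts code points
const byteCount = new TextEncoder().encode(name).length; // counts UTF-8 bytes

console.log(charCount); // 16
console.log(byteCount); // 20
console.log(byteCount > COLUMN_BYTE_LIMIT); // true: fits by characters, overflows by bytes
```

A fixture such as "John Smith" has the same count either way, so it would never expose the difference.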

Faker.js: The Standard Library

Faker.js (@faker-js/faker) is the most widely used mock data library in the JavaScript ecosystem, with 70+ locales and hundreds of data types.

npm install --save-dev @faker-js/faker

import { faker } from "@faker-js/faker"

// Basic data types
faker.person.fullName()           // "Olivia Martinez"
faker.internet.email()            // "o.martinez@gmail.com"
faker.internet.url()              // "https://smart-hat.io"
faker.phone.number()              // "+1 555-234-8901"
faker.location.streetAddress()    // "142 Oak Avenue"
faker.company.name()              // "TechFlow Systems Inc."
faker.lorem.paragraph()           // "Lorem ipsum..."
faker.date.past({ years: 2 })     // Date object 0-2 years ago
faker.finance.amount()            // "128.47"

// IDs and technical data
faker.string.uuid()               // "4136cd0b-d90b-4af7-b485-5d1ded8db252" (version 4 UUID)
faker.internet.ipv4()             // "192.168.1.105"
faker.internet.mac()              // "1a:2b:3c:4d:5e:6f"
faker.color.rgb()                 // "#c8f0e2"

Seeded Randomness: Reproducible Tests

Seeding is the Faker feature developers most often overlook: seed the random number generator and every run produces the same data, so tests are reproducible.

import { faker } from "@faker-js/faker"

// Set a seed — same seed always produces the same data
faker.seed(12345)

const user = {
  name: faker.person.fullName(),   // Same value on every run for a given seed and Faker version
  email: faker.internet.email(),
  id: faker.string.uuid()
}

// Re-seed before each test (beforeEach comes from your test runner, e.g. Jest, Vitest, or Mocha)
// so every test starts from the same generator state
beforeEach(() => faker.seed(42))

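Without seeding, a test might fail once in a thousand runs because the random data happened to hit an edge case, and you would have no way to reproduce the failure. One caveat: helpers such as faker.date.past() are relative to the current time, so pin a reference date with faker.setDefaultRefDate() (or the refDate option) if dates must be stable too.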
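
The principle behind seeding can be shown without Faker at all. The sketch below uses a small mulberry32-style generator; `createSeededRandom` and `pickRoles` are illustrative names, and this is not Faker's internal generator.

```javascript
// Illustrative seeded generator (mulberry32-style). Not Faker's internals.
function createSeededRandom(seed) {
  let state = seed >>> 0;
  return function next() {
    state = (state + 0x6d2b79f5) >>> 0;
    let t = state;
    t = Math.imul(t ^ (t >>> 15), t | 1);
    t ^= t + Math.imul(t ^ (t >>> 7), t | 61);
    return ((t ^ (t >>> 14)) >>> 0) / 4294967296; // float in [0, 1)
  };
}

// Draw `count` roles using the supplied random function.
function pickRoles(random, count) {
  const roles = ["admin", "editor", "viewer"];
  return Array.from({ length: count }, () => roles[Math.floor(random() * roles.length)]);
}

const firstRun = pickRoles(createSeededRandom(42), 5);
const secondRun = pickRoles(createSeededRandom(42), 5);

// Same seed, same sequence: a failing run can be replayed exactly.
console.log(JSON.stringify(firstRun) === JSON.stringify(secondRun)); // true
```

Logging the seed on failure is a common companion habit: a failed run can then be replayed by passing that seed back in.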

Building Factory Functions

Factory functions generate consistent object shapes. Pair them with overrides for flexibility:

import { faker } from "@faker-js/faker"

function createUser(overrides = {}) {
  return {
    id: faker.string.uuid(),
    firstName: faker.person.firstName(),
    lastName: faker.person.lastName(),
    email: faker.internet.email(),
    role: faker.helpers.arrayElement(["admin", "editor", "viewer"]),
    createdAt: faker.date.past({ years: 1 }).toISOString(),
    active: faker.datatype.boolean({ probability: 0.85 }),
    ...overrides   // Caller can override any field
  }
}

function createUsers(count, overrides = {}) {
  return Array.from({ length: count }, () => createUser(overrides))
}

// Usage in tests
const adminUser = createUser({ role: "admin", active: true })
const users = createUsers(50)
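
One limitation of passing a single overrides object to a bulk helper is that every generated record receives the same values. A possible variation, sketched here with a stand-in factory so that it runs without Faker, accepts a function of the index instead. The names `buildUser` and `buildMany` are hypothetical.

```javascript
// Stand-in factory with fixed defaults (a real one would call Faker).
let nextId = 1;
function buildUser(overrides = {}) {
  const id = nextId++;
  return {
    id,
    email: `user${id}@example.com`,
    role: "viewer",
    active: true,
    ...overrides, // caller-supplied fields win over defaults
  };
}

// Build `count` records; `overridesFor` may be an object or a function of the index.
function buildMany(factory, count, overridesFor = {}) {
  return Array.from({ length: count }, (_, index) =>
    factory(typeof overridesFor === "function" ? overridesFor(index) : overridesFor)
  );
}

// The first user is an admin; the rest keep the default role.
const team = buildMany(buildUser, 3, (index) => (index === 0 ? { role: "admin" } : {}));
console.log(team.map((user) => user.role).join(", ")); // admin, viewer, viewer
```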

Locale-Specific Data

import { fakerDE, fakerJA, fakerVI } from "@faker-js/faker"

// German user
fakerDE.person.fullName()        // "Klaus-Dieter Müller"
fakerDE.location.city()          // "Frankfurt am Main"

// Japanese user
fakerJA.person.fullName()        // "田中 健一"

// Vietnamese user
fakerVI.person.fullName()        // "Nguyễn Minh Tuấn"

Always test with non-ASCII names if your app serves international users. Column widths, sorting, and text rendering all behave differently with accented Latin, CJK, and other non-ASCII characters.
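
Sorting is a concrete case. The sketch below compares JavaScript's default sort, which orders by UTF-16 code units, with `Intl.Collator`. The sample surnames are arbitrary and not Faker output.

```javascript
// Normalize to NFC so "Ö" and "ü" are single precomposed code points.
const surnames = ["Zimmermann", "Österreicher", "Müller", "Adams"].map((s) => s.normalize("NFC"));

// Default sort compares UTF-16 code units, so "Ö" (U+00D6) lands after "Z".
const codeUnitOrder = [...surnames].sort();

// A locale-aware collator places "Ö" alongside "O".
const collator = new Intl.Collator("de");
const localeOrder = [...surnames].sort(collator.compare);

console.log(codeUnitOrder.join(", ")); // Adams, Müller, Zimmermann, Österreicher
console.log(localeOrder.join(", "));   // Adams, Müller, Österreicher, Zimmermann
```

A fixture set made only of ASCII names would sort identically under both approaches and hide the difference.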

Relational Mock Data

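Real data has relationships, and generating every field independently produces foreign keys that point at nothing. Create the parent records first, then draw each reference from them: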

function createBlogData() {
  // Create authors first
  const authors = createUsers(5)

  // Create posts referencing valid author IDs
  const posts = Array.from({ length: 20 }, () => ({
    id: faker.string.uuid(),
    title: faker.lorem.sentence(),
    authorId: faker.helpers.arrayElement(authors).id,   // Valid FK
    publishedAt: faker.date.past({ years: 1 }).toISOString(),
    tags: faker.helpers.arrayElements(["js", "css", "node", "react", "api"], { min: 1, max: 3 })
  }))

  return { authors, posts }
}
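
A cheap safeguard is to assert referential integrity on whatever the generator returns. In this sketch `findOrphanedPosts` is a hypothetical helper, and the hand-written sample data stands in for generator output.

```javascript
// Return posts whose authorId does not match any author's id.
function findOrphanedPosts(authors, posts) {
  const authorIds = new Set(authors.map((author) => author.id));
  return posts.filter((post) => !authorIds.has(post.authorId));
}

const sampleAuthors = [{ id: "a1" }, { id: "a2" }];
const samplePosts = [
  { id: "p1", authorId: "a1" },
  { id: "p2", authorId: "a2" },
  { id: "p3", authorId: "a9" }, // deliberately broken reference
];

const orphans = findOrphanedPosts(sampleAuthors, samplePosts);
console.log(orphans.map((post) => post.id).join(", ")); // p3
```

Running a check like this once in the fixture's own test keeps a later refactor from silently reintroducing dangling IDs.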

Edge Case Data

Always include edge case fixtures alongside normal data:

const edgeCaseUsers = [
  createUser({ firstName: "O'Brien", lastName: "Smith-Jones" }),  // Apostrophe, hyphen
  createUser({ firstName: "José", lastName: "García" }),           // Accented chars
  createUser({ firstName: "A", lastName: "X" }),                   // Very short
  createUser({ firstName: faker.string.alpha(50) }),               // Very long
  createUser({ email: "test+tag@sub.domain.example.com" }),        // Complex email
]
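
Edge-case fixtures are most useful when paired with the output you expect. The table-driven sketch below runs a few awkward names through a hypothetical `escapeHtml` function; both the function and the expected values are illustrative, so substitute whatever code your app actually uses.

```javascript
// Hypothetical function under test: escape text for safe insertion into HTML.
function escapeHtml(text) {
  return text
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;")
    .replace(/'/g, "&#39;");
}

// Each fixture pairs an awkward input with the output we expect.
const nameCases = [
  { input: "O'Brien", expected: "O&#39;Brien" },     // apostrophe is escaped
  { input: "Smith-Jones", expected: "Smith-Jones" }, // hyphen passes through
  { input: "José García", expected: "José García" }, // accents pass through
  { input: "A", expected: "A" },                     // very short
];

const failures = nameCases.filter(({ input, expected }) => escapeHtml(input) !== expected);
console.log(failures.length); // 0
```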

For quick, in-browser mock data generation without any setup, use HeoLab's Mock Data Generator to create realistic JSON arrays with configurable fields.

Conclusion

Great mock data uses realistic values, locale-appropriate formats, seeded randomness for reproducibility, and explicit edge cases. Factory functions with overrides give you the flexibility to create specific scenarios without duplicating boilerplate. The investment in good test fixtures pays off every time a bug is caught before it reaches production.

