Mokksy and AI-Mocks

Mokksy and AI-Mocks are mock HTTP and LLM (Large Language Model) servers inspired by WireMock, with support for response streaming and Server-Sent Events (SSE). They help you build and test LLM clients by mocking provider API responses, such as the OpenAI API, during development.

Mokksy

Mokksy is a mock HTTP server built with Kotlin and Ktor. It addresses the limitations of WireMock by supporting true SSE and streaming responses, making it particularly useful for integration testing LLM clients.

Core Features

  • Direct control of server responses via Ktor's ApplicationCall object.
  • Built with Kotest Assertions.
  • A fluent, modern Kotlin DSL API.
  • Support for simulating streamed responses and Server-Sent Events (SSE), with delays between chunks.
  • Support for simulating response delays (see the sketch after this list).
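
For example, a delayed reply can be combined with a predefined body. This is a minimal sketch: the /slow path is illustrative, and it assumes the delay property (used in the error example further below) is also available on regular responses.

mokksy.get {
  path = beEqual("/slow")
} respondsWith {
  // the body is sent only after the configured delay elapses
  body = """{"status": "ok"}"""
  delay = 2.seconds
}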

Example Usages

Responding with Predefined Responses

// given
val expectedResponse =
  // language=json
  """
    {
        "response": "Pong"
    }
    """.trimIndent()

mokksy.get {
  path = beEqual("/ping")
  containsHeader("Foo", "bar")
} respondsWith {
  body = expectedResponse
}

// when
val result = client.get("/ping") {
  headers.append("Foo", "bar")
}

// then
assertThat(result.status).isEqualTo(HttpStatusCode.OK)
assertThat(result.bodyAsText()).isEqualTo(expectedResponse)

POST Request

// given
val id = Random.nextInt()
val expectedResponse =
  // language=json
  """
    {
        "id": "$id",
        "name": "thing-$id"
    }
    """.trimIndent()

mokksy.post {
  path = beEqual("/things")
  bodyContains("\"$id\"")
} respondsWith {
  body = expectedResponse
  httpStatus = HttpStatusCode.Created
  headers {
    // type-safe builder style
    append(HttpHeaders.Location, "/things/$id")
  }
  headers += "Foo" to "bar" // list style
}

// when
val result =
  client.post("/things") {
    headers.append("Content-Type", "application/json")
    setBody(
      // language=json
      """
            {
                "id": "$id"
            }
            """.trimIndent(),
    )
  }

// then
assertThat(result.status).isEqualTo(HttpStatusCode.Created)
assertThat(result.bodyAsText()).isEqualTo(expectedResponse)
assertThat(result.headers["Location"]).isEqualTo("/things/$id")
assertThat(result.headers["Foo"]).isEqualTo("bar")

Server-Sent Events (SSE) Response

Server-Sent Events (SSE) is a technology that allows a server to push updates to the client over a single, long-lived HTTP connection, enabling real-time updates without the client having to continuously poll for new data.

mokksy.post {
  path = beEqual("/sse")
} respondsWithSseStream {
  flow =
    flow {
      delay(200.milliseconds)
      emit(
        ServerSentEvent(
          data = "One",
        ),
      )
      delay(50.milliseconds)
      emit(
        ServerSentEvent(
          data = "Two",
        ),
      )
    }
}

// when
val result = client.post("/sse")

// then
assertThat(result.status)
  .isEqualTo(HttpStatusCode.OK)
assertThat(result.contentType())
  .isEqualTo(ContentType.Text.EventStream.withCharsetIfNeeded(Charsets.UTF_8))
assertThat(result.bodyAsText())
  .isEqualTo("data: One\r\ndata: Two\r\n")

AI-Mocks

AI-Mocks provides specialized mock server implementations (e.g., a mock OpenAI API) built on top of Mokksy.

Mocking OpenAI API

MockOpenai is tested against the official openai-java SDK and popular JVM AI frameworks: LangChain4j and Spring AI.

Currently, it supports only ChatCompletion and streaming ChatCompletion requests.
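
To use MockOpenai in your tests, declare a test dependency. The coordinates below are an assumption based on the project name; check Maven Central for the exact artifact and current version:

// Gradle (Kotlin DSL); coordinates and version placeholder are assumed
testImplementation("me.kpavlov.aimocks:ai-mocks-openai:$latestVersion")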

Set up a mock server and define mock responses:

val openai = MockOpenai(verbose = true)

// Define mock response
openai.completion {
  temperature = temperatureValue
  seed = seedValue
  model = "gpt-4o-mini"
  maxCompletionTokens = maxCompletionTokensValue
  systemMessageContains("helpful assistant")
  userMessageContains("say 'Hello!'")
} responds {
  assistantContent = "Hello"
  finishReason = "stop"
}

// OpenAI client setup
val client: OpenAIClient =
  OpenAIOkHttpClient
    .builder()
    .apiKey("dummy-api-key")
    .baseUrl("http://127.0.0.1:${openai.port()}/v1") // connect to mock OpenAI
    .responseValidation(true)
    .build()

// Use the mock endpoint
val params =
  ChatCompletionCreateParams
    .builder()
    .temperature(temperatureValue)
    .maxCompletionTokens(maxCompletionTokensValue)
    .seed(seedValue.toLong())
    .messages(
      listOf(
        ChatCompletionMessageParam.ofSystem(
          ChatCompletionSystemMessageParam
            .builder()
            .content(
              "You are a helpful assistant.",
            ).build(),
        ),
        ChatCompletionMessageParam.ofUser(
          ChatCompletionUserMessageParam
            .builder()
            .content("Just say 'Hello!' and nothing else")
            .build(),
        ),
      ),
    ).model(ChatModel.GPT_4O_MINI)
    .build()

val result: ChatCompletion =
  client
    .chat()
    .completions()
    .create(params)

println(result)

Mocking negative scenarios

With AI-Mocks you can also test negative scenarios, such as error responses and delays.

openai.completion {
  temperature = temperatureValue
  seed = seedValue
  model = modelName
  maxCompletionTokens = maxCompletionTokensValue
  systemMessageContains("helpful assistant")
  userMessageContains("say 'Hello!'")
} respondsError {
  body =
    // language=json
    """
    {
      "caramba": "Arrr, blast me barnacles! This be not what ye expect! 🏴‍☠️"
    }
    """.trimIndent()
  delay = 1.seconds
  httpStatus = HttpStatusCode.PaymentRequired
}
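
Here the mock replies with HTTP 402 (Payment Required) after a one-second delay, which lets you exercise the client's error handling and timeout behavior.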

How to test LangChain4j/Kotlin

You may also use the LangChain4j Kotlin Extensions:

val model: OpenAiChatModel =
  OpenAiChatModel
    .builder()
    .apiKey("dummy-api-key")
    .baseUrl("http://127.0.0.1:${openai.port()}/v1")
    .build()

val result =
  model.chatAsync {
    parameters =
      OpenAiChatRequestParameters
        .builder()
        .temperature(temperature)
        .modelName("gpt-4o-mini")
        .seed(seedValue)
        .build()
    messages += userMessage("Say Hello")
  }

println(result)

Stream Responses

Mock streaming responses easily with Kotlin Flow support:

// configure mock openai
openai.completion {
  temperature = temperatureValue
  model = "gpt-4o-mini"
  userMessageContains("What is in the sea?")
} respondsStream {
  responseFlow =
    flow {
      emit("Yellow")
      emit(" submarine")
    }
  finishReason = "stop"

  // send "[DONE]" as last message to finish the stream in openai4j
  sendDone = true
}

// create streaming model (a client)
val model: OpenAiStreamingChatModel =
  OpenAiStreamingChatModel
    .builder()
    .apiKey("foo")
    .baseUrl("http://127.0.0.1:${openai.port()}/v1")
    .build()

// call streaming model
model
  .chatFlow {
    parameters =
      ChatRequestParameters
        .builder()
        .temperature(temperatureValue)
        .modelName("gpt-4o-mini")
        .build()
    messages += userMessage(userMessage)
  }.collect {
    when (it) {
      is PartialResponse -> {
        println("token = ${it.token}")
      }

      is CompleteResponse -> {
        println("Completed: $it")
      }

      else -> {
        println("Something else = $it")
      }
    }
  }
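
Each string emitted by responseFlow arrives as a PartialResponse token, followed by a single CompleteResponse once the stream finishes.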

How to test Spring-AI

To test a Spring-AI integration:

// create mock server
val openai = MockOpenai(verbose = true)

// create Spring-AI client
val chatClient =
  ChatClient
    .builder(
      org.springframework.ai.openai.OpenAiChatModel
        .builder()
        .openAiApi(
          OpenAiApi
            .builder()
            .apiKey("demo-key")
            .baseUrl("http://127.0.0.1:${openai.port()}")
            .build(),
        ).build(),
    ).build()

// Set up a mock for the LLM call
openai.completion {
  temperature = temperatureValue
  seed = seedValue
  model = modelName
  maxCompletionTokens = maxCompletionTokensValue
  systemMessageContains("helpful pirate")
  userMessageContains("say 'Hello!'")
} responds {
  assistantContent = "Ahoy there, matey! Hello!"
  finishReason = "stop"
}

// Configure Spring-AI client call
val response =
  chatClient
    .prompt()
    .system("You are a helpful pirate")
    .user("Just say 'Hello!'")
    .options<OpenAiChatOptions>(
      OpenAiChatOptions
        .builder()
        .maxCompletionTokens(maxCompletionTokensValue.toInt())
        .temperature(temperatureValue)
        .model(modelName)
        .seed(seedValue)
        .build(),
    )
    // Make a call
    .call()
    .chatResponse()

// Verify the response
response?.result shouldNotBe null
response?.result?.apply {
  metadata.finishReason shouldBe "STOP"
  output?.text shouldBe "Ahoy there, matey! Hello!"
}

See the integration tests for more examples.

How to build

To build the project locally, run:

gradle build

or using Make:

make

Contributing

I do welcome contributions! Please see the Contributing Guidelines for details.

Enjoying LLM integration testing? ❤️

Buy me a Coffee