count_tokens

Operation

count_tokens (async)

count_tokens(input: CountTokensOperationInput, plugins: list[Plugin] | None = None) -> CountTokensOutput

Returns the token count for a given inference request. This operation helps you estimate token usage before sending requests to foundation models: it reports the number of tokens the model would process if you sent the same input in an actual inference request.

Token counting is model-specific because different models use different tokenization strategies. The token count returned by this operation will match the token count that would be charged if the same input were sent to the model in an InvokeModel or Converse request.

You can use this operation to:

  • Estimate costs before sending inference requests.

  • Optimize prompts to fit within token limits.

  • Plan for token usage in your applications.

This operation accepts the same input formats as InvokeModel and Converse, allowing you to count tokens for both raw text inputs and structured conversation formats.

The following operations are related to CountTokens:

  • InvokeModel - Sends inference requests to foundation models

  • Converse - Sends conversation-based inference requests to foundation models
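
A minimal async call is sketched below. The client class name (BedrockRuntimeClient) and its zero-argument construction are assumptions for illustration, and the placeholder input value stands in for a CountTokensInput union value (see the Input section below); check the generated client module for the real constructor and configuration requirements.

import asyncio

from aws_sdk_bedrock_runtime.client import BedrockRuntimeClient  # assumed class name
from aws_sdk_bedrock_runtime.models import CountTokensOperationInput

async def main() -> None:
    client = BedrockRuntimeClient()  # hypothetical construction; real config may be required
    op_input = CountTokensOperationInput(
        model_id="anthropic.claude-3-haiku-20240307-v1:0",
        input=...,  # placeholder: a CountTokensInput union value (see the Input section below)
    )
    output = await client.count_tokens(op_input)
    print(f"Input tokens: {output.input_tokens}")

asyncio.run(main())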

Parameters:

  input (CountTokensOperationInput, required)
      An instance of CountTokensOperationInput.

  plugins (list[Plugin] | None, default None)
      A list of callables that modify the configuration dynamically. Changes made by these plugins only apply for the duration of the operation execution and will not affect any other operation invocations.
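
As the source listing below shows, each plugin is just a callable invoked with the deep-copied per-operation config, so a one-off override can be a plain function. This sketch assumes a SimpleRetryStrategy class is importable from smithy_core.retries; substitute whatever retry strategy your runtime actually provides.

from smithy_core.retries import SimpleRetryStrategy  # assumed import path

def single_attempt(config) -> None:
    # Mutates only this call's deep-copied config; other invocations are unaffected.
    config.retry_strategy = SimpleRetryStrategy(max_attempts=1)

# Inside an async function, using the client and op_input from the sketch above:
output = await client.count_tokens(op_input, plugins=[single_attempt])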

Returns:

  CountTokensOutput
      An instance of CountTokensOutput.

Source code in src/aws_sdk_bedrock_runtime/client.py
async def count_tokens(
    self, input: CountTokensOperationInput, plugins: list[Plugin] | None = None
) -> CountTokensOutput:
    """Returns the token count for a given inference request. This operation
    helps you estimate token usage before sending requests to foundation
    models by returning the token count that would be used if the same input
    were sent to the model in an inference request.

    Token counting is model-specific because different models use different
    tokenization strategies. The token count returned by this operation will
    match the token count that would be charged if the same input were sent
    to the model in an `InvokeModel` or `Converse` request.

    You can use this operation to:

    - Estimate costs before sending inference requests.

    - Optimize prompts to fit within token limits.

    - Plan for token usage in your applications.

    This operation accepts the same input formats as `InvokeModel` and
    `Converse`, allowing you to count tokens for both raw text inputs and
    structured conversation formats.

    The following operations are related to `CountTokens`:

    - [InvokeModel](https://docs.aws.amazon.com/bedrock/latest/API/API_runtime_InvokeModel.html) -
      Sends inference requests to foundation models

    - [Converse](https://docs.aws.amazon.com/bedrock/latest/API/API_runtime_Converse.html) -
      Sends conversation-based inference requests to foundation models

    Args:
        input:
            An instance of `CountTokensOperationInput`.
        plugins:
            A list of callables that modify the configuration dynamically.
            Changes made by these plugins only apply for the duration of the
            operation execution and will not affect any other operation
            invocations.

    Returns:
        An instance of `CountTokensOutput`.
    """
    operation_plugins: list[Plugin] = []
    if plugins:
        operation_plugins.extend(plugins)
    config = deepcopy(self._config)
    for plugin in operation_plugins:
        plugin(config)
    if config.protocol is None or config.transport is None:
        raise ExpectationNotMetError(
            "protocol and transport MUST be set on the config to make calls."
        )
    pipeline = RequestPipeline(protocol=config.protocol, transport=config.transport)
    call = ClientCall(
        input=input,
        operation=COUNT_TOKENS,
        context=TypedProperties({"config": config}),
        interceptor=InterceptorChain(config.interceptors),
        auth_scheme_resolver=config.auth_scheme_resolver,
        supported_auth_schemes=config.auth_schemes,
        endpoint_resolver=config.endpoint_resolver,
        retry_strategy=config.retry_strategy,
    )

    return await pipeline(call)

Input

CountTokensOperationInput (dataclass)

Dataclass for CountTokensOperationInput structure.

Source code in src/aws_sdk_bedrock_runtime/models.py
@dataclass(kw_only=True)
class CountTokensOperationInput:
    """Dataclass for CountTokensOperationInput structure."""

    model_id: str | None = None
    """The unique identifier or ARN of the foundation model to use for token
    counting. Each model processes tokens differently, so the token count is
    specific to the model you specify.
    """

    input: CountTokensInput | None = None
    """The input for which to count tokens. The structure of this parameter
    depends on whether you're counting tokens for an `InvokeModel` or
    `Converse` request:

    - For `InvokeModel` requests, provide the request body in the
      `invokeModel` field

    - For `Converse` requests, provide the messages and system content in
      the `converse` field

    The input format must be compatible with the model specified in the
    `modelId` parameter.
    """

    def serialize(self, serializer: ShapeSerializer):
        serializer.write_struct(_SCHEMA_COUNT_TOKENS_OPERATION_INPUT, self)

    def serialize_members(self, serializer: ShapeSerializer):
        if self.model_id is not None:
            serializer.write_string(
                _SCHEMA_COUNT_TOKENS_OPERATION_INPUT.members["modelId"], self.model_id
            )

        if self.input is not None:
            serializer.write_struct(
                _SCHEMA_COUNT_TOKENS_OPERATION_INPUT.members["input"], self.input
            )

    @classmethod
    def deserialize(cls, deserializer: ShapeDeserializer) -> Self:
        return cls(**cls.deserialize_kwargs(deserializer))

    @classmethod
    def deserialize_kwargs(cls, deserializer: ShapeDeserializer) -> dict[str, Any]:
        kwargs: dict[str, Any] = {}

        def _consumer(schema: Schema, de: ShapeDeserializer) -> None:
            match schema.expect_member_index():
                case 0:
                    kwargs["model_id"] = de.read_string(
                        _SCHEMA_COUNT_TOKENS_OPERATION_INPUT.members["modelId"]
                    )

                case 1:
                    kwargs["input"] = _CountTokensInputDeserializer().deserialize(de)

                case _:
                    logger.debug("Unexpected member schema: %s", schema)

        deserializer.read_struct(
            _SCHEMA_COUNT_TOKENS_OPERATION_INPUT, consumer=_consumer
        )
        return kwargs

Attributes

input (class attribute, instance attribute)
input: CountTokensInput | None = None

The input for which to count tokens. The structure of this parameter depends on whether you're counting tokens for an InvokeModel or Converse request:

  • For InvokeModel requests, provide the request body in the invokeModel field

  • For Converse requests, provide the messages and system content in the converse field

The input format must be compatible with the model specified in the modelId parameter. A construction sketch follows at the end of this Attributes section.

model_id (class attribute, instance attribute)
model_id: str | None = None

The unique identifier or ARN of the foundation model to use for token counting. Each model processes tokens differently, so the token count is specific to the model you specify.
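
The CountTokensInput union members are not expanded on this page, so the sketch below invents smithy-style variant names (CountTokensInputInvokeModel wrapping an InvokeModel-style request whose body is the raw request payload). Every imported name other than CountTokensOperationInput is hypothetical; verify the real classes in aws_sdk_bedrock_runtime.models before use.

import json

from aws_sdk_bedrock_runtime.models import (
    CountTokensInputInvokeModel,   # hypothetical union variant name
    CountTokensOperationInput,
    InvokeModelTokensRequest,      # hypothetical request shape name
)

# Raw InvokeModel request body, exactly as it would be sent for inference.
body = json.dumps({
    "anthropic_version": "bedrock-2023-05-31",
    "max_tokens": 100,
    "messages": [{"role": "user", "content": "Hello, world"}],
}).encode()

op_input = CountTokensOperationInput(
    model_id="anthropic.claude-3-haiku-20240307-v1:0",
    input=CountTokensInputInvokeModel(
        value=InvokeModelTokensRequest(body=body),  # assumes a single `value` field, per smithy unions
    ),
)

A Converse-style count would instead wrap messages and system content in the converse variant, as described under the input attribute above.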

Output

CountTokensOutput (dataclass)

Dataclass for CountTokensOutput structure.

Source code in src/aws_sdk_bedrock_runtime/models.py
@dataclass(kw_only=True)
class CountTokensOutput:
    """Dataclass for CountTokensOutput structure."""

    input_tokens: int
    """The number of tokens in the provided input according to the specified
    model's tokenization rules. This count represents the number of input
    tokens that would be processed if the same input were sent to the model
    in an inference request. Use this value to estimate costs and ensure
    your inputs stay within model token limits.
    """

    def serialize(self, serializer: ShapeSerializer):
        serializer.write_struct(_SCHEMA_COUNT_TOKENS_OUTPUT, self)

    def serialize_members(self, serializer: ShapeSerializer):
        serializer.write_integer(
            _SCHEMA_COUNT_TOKENS_OUTPUT.members["inputTokens"], self.input_tokens
        )

    @classmethod
    def deserialize(cls, deserializer: ShapeDeserializer) -> Self:
        return cls(**cls.deserialize_kwargs(deserializer))

    @classmethod
    def deserialize_kwargs(cls, deserializer: ShapeDeserializer) -> dict[str, Any]:
        kwargs: dict[str, Any] = {}

        def _consumer(schema: Schema, de: ShapeDeserializer) -> None:
            match schema.expect_member_index():
                case 0:
                    kwargs["input_tokens"] = de.read_integer(
                        _SCHEMA_COUNT_TOKENS_OUTPUT.members["inputTokens"]
                    )

                case _:
                    logger.debug("Unexpected member schema: %s", schema)

        deserializer.read_struct(_SCHEMA_COUNT_TOKENS_OUTPUT, consumer=_consumer)
        return kwargs

Attributes

input_tokens (instance attribute)
input_tokens: int

The number of tokens in the provided input according to the specified model's tokenization rules. This count represents the number of input tokens that would be processed if the same input were sent to the model in an inference request. Use this value to estimate costs and ensure your inputs stay within model token limits.
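
Since input_tokens matches what the model would actually process, a common pattern is to gate the real inference call on a token budget. The snippet below continues the earlier sketches; the limit is an arbitrary example, not a published model limit.

MAX_INPUT_TOKENS = 8_000  # example budget only

# Inside an async function, using the client and op_input from earlier sketches:
output = await client.count_tokens(op_input)
if output.input_tokens > MAX_INPUT_TOKENS:
    raise ValueError(
        f"Prompt is {output.input_tokens} tokens; budget is {MAX_INPUT_TOKENS}."
    )
# Within budget: safe to send the same input via InvokeModel or Converse.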

Errors