count_tokens

Operation

count_tokens (async)

count_tokens(input: CountTokensOperationInput, plugins: list[Plugin] | None = None) -> CountTokensOutput

Returns the token count for a given inference request. This operation helps you estimate token usage before sending requests to foundation models: it reports the number of tokens the model would process if you sent the same input in an actual inference request.

Token counting is model-specific because different models use different tokenization strategies. The token count returned by this operation will match the token count that would be charged if the same input were sent to the model in an InvokeModel or Converse request.

You can use this operation to:

  • Estimate costs before sending inference requests.

  • Optimize prompts to fit within token limits.

  • Plan for token usage in your applications.

This operation accepts the same input formats as InvokeModel and Converse, allowing you to count tokens for both raw text inputs and structured conversation formats.

The following operations are related to CountTokens:

  • InvokeModel - Sends inference requests to foundation models

  • Converse - Sends conversation-based inference requests to foundation models
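
A minimal async call is sketched below. The client class name (BedrockRuntimeClient) and its zero-argument construction are assumptions for illustration, and the placeholder input value stands in for a CountTokensInput union value (see the Input section below); check the generated client module for the real constructor and configuration requirements.

import asyncio

from aws_sdk_bedrock_runtime.client import BedrockRuntimeClient  # assumed class name
from aws_sdk_bedrock_runtime.models import CountTokensOperationInput

async def main() -> None:
    client = BedrockRuntimeClient()  # hypothetical construction; real config may be required
    op_input = CountTokensOperationInput(
        model_id="anthropic.claude-3-haiku-20240307-v1:0",
        input=...,  # placeholder: a CountTokensInput union value (see the Input section below)
    )
    output = await client.count_tokens(op_input)
    print(f"Input tokens: {output.input_tokens}")

asyncio.run(main())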

Parameters:

  input (CountTokensOperationInput, required)
      An instance of CountTokensOperationInput.

  plugins (list[Plugin] | None, default None)
      A list of callables that modify the configuration dynamically. Changes made by these plugins only apply for the duration of the operation execution and will not affect any other operation invocations.
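
As the source listing below shows, each plugin is just a callable invoked with the deep-copied per-operation config, so a one-off override can be a plain function. This sketch assumes a SimpleRetryStrategy class is importable from smithy_core.retries; substitute whatever retry strategy your runtime actually provides.

from smithy_core.retries import SimpleRetryStrategy  # assumed import path

def single_attempt(config) -> None:
    # Mutates only this call's deep-copied config; other invocations are unaffected.
    config.retry_strategy = SimpleRetryStrategy(max_attempts=1)

# Inside an async function, using the client and op_input from the sketch above:
output = await client.count_tokens(op_input, plugins=[single_attempt])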

Returns:

  CountTokensOutput
      An instance of CountTokensOutput.

Source code in src/aws_sdk_bedrock_runtime/client.py
async def count_tokens(
    self, input: CountTokensOperationInput, plugins: list[Plugin] | None = None
) -> CountTokensOutput:
    """Returns the token count for a given inference request. This operation
    helps you estimate token usage before sending requests to foundation
    models by returning the token count that would be used if the same input
    were sent to the model in an inference request.

    Token counting is model-specific because different models use different
    tokenization strategies. The token count returned by this operation will
    match the token count that would be charged if the same input were sent
    to the model in an `InvokeModel` or `Converse` request.

    You can use this operation to:

    - Estimate costs before sending inference requests.

    - Optimize prompts to fit within token limits.

    - Plan for token usage in your applications.

    This operation accepts the same input formats as `InvokeModel` and
    `Converse`, allowing you to count tokens for both raw text inputs and
    structured conversation formats.

    The following operations are related to `CountTokens`:

    - [InvokeModel](https://docs.aws.amazon.com/bedrock/latest/API/API_runtime_InvokeModel.html) -
      Sends inference requests to foundation models

    - [Converse](https://docs.aws.amazon.com/bedrock/latest/API/API_runtime_Converse.html) -
      Sends conversation-based inference requests to foundation models

    Args:
        input:
            An instance of `CountTokensOperationInput`.
        plugins:
            A list of callables that modify the configuration dynamically.
            Changes made by these plugins only apply for the duration of the
            operation execution and will not affect any other operation
            invocations.

    Returns:
        An instance of `CountTokensOutput`.
    """
    operation_plugins: list[Plugin] = []
    if plugins:
        operation_plugins.extend(plugins)
    config = deepcopy(self._config)
    for plugin in operation_plugins:
        plugin(config)
    if config.protocol is None or config.transport is None:
        raise ExpectationNotMetError(
            "protocol and transport MUST be set on the config to make calls."
        )
    pipeline = RequestPipeline(protocol=config.protocol, transport=config.transport)
    call = ClientCall(
        input=input,
        operation=COUNT_TOKENS,
        context=TypedProperties({"config": config}),
        interceptor=InterceptorChain(config.interceptors),
        auth_scheme_resolver=config.auth_scheme_resolver,
        supported_auth_schemes=config.auth_schemes,
        endpoint_resolver=config.endpoint_resolver,
        retry_strategy=config.retry_strategy,
    )

    return await pipeline(call)

Input

CountTokensOperationInput (dataclass)

Dataclass for CountTokensOperationInput structure.

Source code in src/aws_sdk_bedrock_runtime/models.py
@dataclass(kw_only=True)
class CountTokensOperationInput:
    """Dataclass for CountTokensOperationInput structure."""

    model_id: str | None = None
    """The unique identifier or ARN of the foundation model to use for token
    counting. Each model processes tokens differently, so the token count is
    specific to the model you specify.
    """

    input: CountTokensInput | None = None
    """The input for which to count tokens. The structure of this parameter
    depends on whether you're counting tokens for an `InvokeModel` or
    `Converse` request:

    - For `InvokeModel` requests, provide the request body in the
      `invokeModel` field

    - For `Converse` requests, provide the messages and system content in
      the `converse` field

    The input format must be compatible with the model specified in the
    `modelId` parameter.
    """

    def serialize(self, serializer: ShapeSerializer):
        serializer.write_struct(_SCHEMA_COUNT_TOKENS_OPERATION_INPUT, self)

    def serialize_members(self, serializer: ShapeSerializer):
        if self.model_id is not None:
            serializer.write_string(
                _SCHEMA_COUNT_TOKENS_OPERATION_INPUT.members["modelId"], self.model_id
            )

        if self.input is not None:
            serializer.write_struct(
                _SCHEMA_COUNT_TOKENS_OPERATION_INPUT.members["input"], self.input
            )

    @classmethod
    def deserialize(cls, deserializer: ShapeDeserializer) -> Self:
        return cls(**cls.deserialize_kwargs(deserializer))

    @classmethod
    def deserialize_kwargs(cls, deserializer: ShapeDeserializer) -> dict[str, Any]:
        kwargs: dict[str, Any] = {}

        def _consumer(schema: Schema, de: ShapeDeserializer) -> None:
            match schema.expect_member_index():
                case 0:
                    kwargs["model_id"] = de.read_string(
                        _SCHEMA_COUNT_TOKENS_OPERATION_INPUT.members["modelId"]
                    )

                case 1:
                    kwargs["input"] = _CountTokensInputDeserializer().deserialize(de)

                case _:
                    logger.debug("Unexpected member schema: %s", schema)

        deserializer.read_struct(
            _SCHEMA_COUNT_TOKENS_OPERATION_INPUT, consumer=_consumer
        )
        return kwargs

Attributes

input (class attribute, instance attribute)
input: CountTokensInput | None = None

The input for which to count tokens. The structure of this parameter depends on whether you're counting tokens for an InvokeModel or Converse request:

  • For InvokeModel requests, provide the request body in the invokeModel field

  • For Converse requests, provide the messages and system content in the converse field

The input format must be compatible with the model specified in the modelId parameter. A construction sketch follows at the end of this Attributes section.

model_id (class attribute, instance attribute)
model_id: str | None = None

The unique identifier or ARN of the foundation model to use for token counting. Each model processes tokens differently, so the token count is specific to the model you specify.
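
The CountTokensInput union members are not expanded on this page, so the sketch below invents smithy-style variant names (CountTokensInputInvokeModel wrapping an InvokeModel-style request whose body is the raw request payload). Every imported name other than CountTokensOperationInput is hypothetical; verify the real classes in aws_sdk_bedrock_runtime.models before use.

import json

from aws_sdk_bedrock_runtime.models import (
    CountTokensInputInvokeModel,   # hypothetical union variant name
    CountTokensOperationInput,
    InvokeModelTokensRequest,      # hypothetical request shape name
)

# Raw InvokeModel request body, exactly as it would be sent for inference.
body = json.dumps({
    "anthropic_version": "bedrock-2023-05-31",
    "max_tokens": 100,
    "messages": [{"role": "user", "content": "Hello, world"}],
}).encode()

op_input = CountTokensOperationInput(
    model_id="anthropic.claude-3-haiku-20240307-v1:0",
    input=CountTokensInputInvokeModel(
        value=InvokeModelTokensRequest(body=body),  # assumes a single `value` field, per smithy unions
    ),
)

A Converse-style count would instead wrap messages and system content in the converse variant, as described under the input attribute above.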

Output

CountTokensOutput (dataclass)

Dataclass for CountTokensOutput structure.

Source code in src/aws_sdk_bedrock_runtime/models.py
@dataclass(kw_only=True)
class CountTokensOutput:
    """Dataclass for CountTokensOutput structure."""

    input_tokens: int
    """The number of tokens in the provided input according to the specified
    model's tokenization rules. This count represents the number of input
    tokens that would be processed if the same input were sent to the model
    in an inference request. Use this value to estimate costs and ensure
    your inputs stay within model token limits.
    """

    def serialize(self, serializer: ShapeSerializer):
        serializer.write_struct(_SCHEMA_COUNT_TOKENS_OUTPUT, self)

    def serialize_members(self, serializer: ShapeSerializer):
        serializer.write_integer(
            _SCHEMA_COUNT_TOKENS_OUTPUT.members["inputTokens"], self.input_tokens
        )

    @classmethod
    def deserialize(cls, deserializer: ShapeDeserializer) -> Self:
        return cls(**cls.deserialize_kwargs(deserializer))

    @classmethod
    def deserialize_kwargs(cls, deserializer: ShapeDeserializer) -> dict[str, Any]:
        kwargs: dict[str, Any] = {}

        def _consumer(schema: Schema, de: ShapeDeserializer) -> None:
            match schema.expect_member_index():
                case 0:
                    kwargs["input_tokens"] = de.read_integer(
                        _SCHEMA_COUNT_TOKENS_OUTPUT.members["inputTokens"]
                    )

                case _:
                    logger.debug("Unexpected member schema: %s", schema)

        deserializer.read_struct(_SCHEMA_COUNT_TOKENS_OUTPUT, consumer=_consumer)
        return kwargs

Attributes

input_tokens (instance attribute)
input_tokens: int

The number of tokens in the provided input according to the specified model's tokenization rules. This count represents the number of input tokens that would be processed if the same input were sent to the model in an inference request. Use this value to estimate costs and ensure your inputs stay within model token limits.
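
Since input_tokens matches what the model would actually process, a common pattern is to gate the real inference call on a token budget. The snippet below continues the earlier sketches; the limit is an arbitrary example, not a published model limit.

MAX_INPUT_TOKENS = 8_000  # example budget only

# Inside an async function, using the client and op_input from earlier sketches:
output = await client.count_tokens(op_input)
if output.input_tokens > MAX_INPUT_TOKENS:
    raise ValueError(
        f"Prompt is {output.input_tokens} tokens; budget is {MAX_INPUT_TOKENS}."
    )
# Within budget: safe to send the same input via InvokeModel or Converse.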

Errors