InferenceConfiguration dataclass

Base inference parameters to pass to a model in a call to Converse or ConverseStream. For more information, see Inference parameters for foundation models.

If you need to pass additional parameters that the model supports, use the additionalModelRequestFields request field in the call to Converse or ConverseStream. For more information, see Model parameters.
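
For orientation, here is a minimal construction sketch. Only InferenceConfiguration and its fields are documented on this page; the values shown are illustrative, and how the object is attached to a Converse or ConverseStream request depends on request shapes defined elsewhere in this SDK.

```python
from aws_sdk_bedrock_runtime.models import InferenceConfiguration

# All fields are keyword-only and optional. Fields left as None are
# omitted from the serialized request, so the model's defaults apply.
config = InferenceConfiguration(
    max_tokens=512,                 # cap the length of the generated response
    temperature=0.5,                # illustrative value
    stop_sequences=["\n\nHuman:"],  # illustrative, model-specific sequence
)
```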

Source code in src/aws_sdk_bedrock_runtime/models.py
```python
@dataclass(kw_only=True)
class InferenceConfiguration:
    """Base inference parameters to pass to a model in a call to
    [Converse](https://docs.aws.amazon.com/bedrock/latest/APIReference/API_runtime_Converse.html)
    or
    [ConverseStream](https://docs.aws.amazon.com/bedrock/latest/APIReference/API_runtime_ConverseStream.html).
    For more information, see [Inference parameters for foundation
    models](https://docs.aws.amazon.com/bedrock/latest/userguide/model-parameters.html).

    If you need to pass additional parameters that the model supports, use
    the `additionalModelRequestFields` request field in the call to
    `Converse` or `ConverseStream`. For more information, see [Model
    parameters](https://docs.aws.amazon.com/bedrock/latest/userguide/model-parameters.html).
    """

    max_tokens: int | None = None
    """The maximum number of tokens to allow in the generated response. The
    default value is the maximum allowed value for the model that you are
    using. For more information, see [Inference parameters for foundation
    models](https://docs.aws.amazon.com/bedrock/latest/userguide/model-parameters.html).
    """

    temperature: float | None = None
    """The likelihood of the model selecting higher-probability options while
    generating a response. A lower value makes the model more likely to
    choose higher-probability options, while a higher value makes the model
    more likely to choose lower-probability options.

    If you don't set a value, the model's own default is used.
    For more information, see [Inference parameters for foundation
    models](https://docs.aws.amazon.com/bedrock/latest/userguide/model-parameters.html).
    """

    top_p: float | None = None
    """The percentage of most-likely candidates that the model considers for
    the next token. For example, if you choose a value of 0.8 for `topP`,
    the model selects from the top 80% of the probability distribution of
    tokens that could be next in the sequence.

    If you don't set a value, the model's own default is used.
    For more information, see [Inference parameters for foundation
    models](https://docs.aws.amazon.com/bedrock/latest/userguide/model-parameters.html).
    """

    stop_sequences: list[str] | None = None
    """A list of stop sequences. A stop sequence is a sequence of characters
    that causes the model to stop generating the response.
    """

    def serialize(self, serializer: ShapeSerializer):
        serializer.write_struct(_SCHEMA_INFERENCE_CONFIGURATION, self)

    def serialize_members(self, serializer: ShapeSerializer):
        if self.max_tokens is not None:
            serializer.write_integer(
                _SCHEMA_INFERENCE_CONFIGURATION.members["maxTokens"], self.max_tokens
            )

        if self.temperature is not None:
            serializer.write_float(
                _SCHEMA_INFERENCE_CONFIGURATION.members["temperature"], self.temperature
            )

        if self.top_p is not None:
            serializer.write_float(
                _SCHEMA_INFERENCE_CONFIGURATION.members["topP"], self.top_p
            )

        if self.stop_sequences is not None:
            _serialize_non_empty_string_list(
                serializer,
                _SCHEMA_INFERENCE_CONFIGURATION.members["stopSequences"],
                self.stop_sequences,
            )

    @classmethod
    def deserialize(cls, deserializer: ShapeDeserializer) -> Self:
        return cls(**cls.deserialize_kwargs(deserializer))

    @classmethod
    def deserialize_kwargs(cls, deserializer: ShapeDeserializer) -> dict[str, Any]:
        kwargs: dict[str, Any] = {}

        def _consumer(schema: Schema, de: ShapeDeserializer) -> None:
            match schema.expect_member_index():
                case 0:
                    kwargs["max_tokens"] = de.read_integer(
                        _SCHEMA_INFERENCE_CONFIGURATION.members["maxTokens"]
                    )

                case 1:
                    kwargs["temperature"] = de.read_float(
                        _SCHEMA_INFERENCE_CONFIGURATION.members["temperature"]
                    )

                case 2:
                    kwargs["top_p"] = de.read_float(
                        _SCHEMA_INFERENCE_CONFIGURATION.members["topP"]
                    )

                case 3:
                    kwargs["stop_sequences"] = _deserialize_non_empty_string_list(
                        de, _SCHEMA_INFERENCE_CONFIGURATION.members["stopSequences"]
                    )

                case _:
                    logger.debug("Unexpected member schema: %s", schema)

        deserializer.read_struct(_SCHEMA_INFERENCE_CONFIGURATION, consumer=_consumer)
        return kwargs
```
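
Because InferenceConfiguration is a plain dataclass, standard-library helpers work as expected. Below is a minimal sketch using dataclasses.replace to derive a variant configuration without mutating the original; the field values are illustrative:

```python
from dataclasses import replace

from aws_sdk_bedrock_runtime.models import InferenceConfiguration

base = InferenceConfiguration(max_tokens=512, temperature=0.2)

# replace() returns a new instance; `base` is left untouched.
exploratory = replace(base, temperature=0.9)
```

Note that serialize_members writes only members that are not None, so base above serializes without topP or stopSequences entries.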

Attributes

max_tokens class-attribute instance-attribute

max_tokens: int | None = None

The maximum number of tokens to allow in the generated response. The default value is the maximum allowed value for the model that you are using. For more information, see Inference parameters for foundation models.

stop_sequences class-attribute instance-attribute

stop_sequences: list[str] | None = None

A list of stop sequences. A stop sequence is a sequence of characters that causes the model to stop generating the response.
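
A short sketch with arbitrary illustrative sequences; whether the matched sequence itself appears in the returned text is model-specific:

```python
from aws_sdk_bedrock_runtime.models import InferenceConfiguration

# Generation halts as soon as the model emits either sequence.
config = InferenceConfiguration(
    stop_sequences=["</answer>", "\n\n---"],
)
```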

temperature class-attribute instance-attribute

temperature: float | None = None

The likelihood of the model selecting higher-probability options while generating a response. A lower value makes the model more likely to choose higher-probability options, while a higher value makes the model more likely to choose lower-probability options.

If you don't set a value, the model's own default is used. For more information, see Inference parameters for foundation models.

top_p class-attribute instance-attribute

top_p: float | None = None

The percentage of most-likely candidates that the model considers for the next token. For example, if you choose a value of 0.8 for topP, the model selects from the top 80% of the probability distribution of tokens that could be next in the sequence.

If you don't set a value, the model's own default is used. For more information, see Inference parameters for foundation models.
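
temperature reshapes the whole token distribution, while top_p restricts sampling to the most likely tokens whose cumulative probability reaches the threshold. A common recommendation (see the inference-parameters guide referenced above) is to tune one of the two rather than both. A sketch with illustrative values:

```python
from aws_sdk_bedrock_runtime.models import InferenceConfiguration

# Deterministic-leaning: low temperature, top_p left at the model default.
precise = InferenceConfiguration(temperature=0.1)

# Diverse-leaning: sample only from the top 90% of probability mass.
diverse = InferenceConfiguration(top_p=0.9)
```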