Base inference parameters to pass to a model in a call to Converse or ConverseStream. For more information, see Inference parameters for foundation models.

If you need to pass additional parameters that the model supports, use the additionalModelRequestFields request field in the call to Converse or ConverseStream. For more information, see Model parameters.
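As a quick illustration, the sketch below constructs an InferenceConfiguration and shows where it would be attached to a Converse request. Only the InferenceConfiguration fields come from this page; the client and request type names and the inference_config / additional_model_request_fields parameters are assumptions about the surrounding SDK and may differ in your version.

```python
# Sketch only: the InferenceConfiguration fields are documented on this page;
# the request shape shown in comments is an assumption and may differ.
from aws_sdk_bedrock_runtime.models import InferenceConfiguration

# Cap the response length, keep sampling fairly deterministic, and stop
# generation early if the model emits a custom marker.
config = InferenceConfiguration(
    max_tokens=512,
    temperature=0.2,
    top_p=0.9,
    stop_sequences=["\n\nHuman:"],
)

# Hypothetical call shape: the config travels as the inferenceConfig of a
# Converse request; model-specific extras go in additionalModelRequestFields.
# request = ConverseInput(
#     model_id="anthropic.claude-3-5-sonnet-20240620-v1:0",
#     messages=messages,
#     inference_config=config,
#     additional_model_request_fields={"top_k": 250},  # example extra field
# )
```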
Source code in src/aws_sdk_bedrock_runtime/models.py
@dataclass(kw_only=True)
class InferenceConfiguration:
    """Base inference parameters to pass to a model in a call to
    [Converse](https://docs.aws.amazon.com/bedrock/latest/APIReference/API_runtime_Converse.html)
    or
    [ConverseStream](https://docs.aws.amazon.com/bedrock/latest/APIReference/API_runtime_ConverseStream.html).
    For more information, see [Inference parameters for foundation
    models](https://docs.aws.amazon.com/bedrock/latest/userguide/model-parameters.html).
    If you need to pass additional parameters that the model supports, use
    the `additionalModelRequestFields` request field in the call to
    `Converse` or `ConverseStream`. For more information, see [Model
    parameters](https://docs.aws.amazon.com/bedrock/latest/userguide/model-parameters.html).
    """

    max_tokens: int | None = None
    """The maximum number of tokens to allow in the generated response. The
    default value is the maximum allowed value for the model that you are
    using. For more information, see [Inference parameters for foundation
    models](https://docs.aws.amazon.com/bedrock/latest/userguide/model-parameters.html).
    """

    temperature: float | None = None
    """The likelihood of the model selecting higher-probability options while
    generating a response. A lower value makes the model more likely to
    choose higher-probability options, while a higher value makes the model
    more likely to choose lower-probability options.
    The default value is the default value for the model that you are using.
    For more information, see [Inference parameters for foundation
    models](https://docs.aws.amazon.com/bedrock/latest/userguide/model-parameters.html).
    """

    top_p: float | None = None
    """The percentage of most-likely candidates that the model considers for
    the next token. For example, if you choose a value of 0.8 for `topP`,
    the model selects from the top 80% of the probability distribution of
    tokens that could be next in the sequence.
    The default value is the default value for the model that you are using.
    For more information, see [Inference parameters for foundation
    models](https://docs.aws.amazon.com/bedrock/latest/userguide/model-parameters.html).
    """

    stop_sequences: list[str] | None = None
    """A list of stop sequences. A stop sequence is a sequence of characters
    that causes the model to stop generating the response.
    """

    def serialize(self, serializer: ShapeSerializer):
        serializer.write_struct(_SCHEMA_INFERENCE_CONFIGURATION, self)

    def serialize_members(self, serializer: ShapeSerializer):
        if self.max_tokens is not None:
            serializer.write_integer(
                _SCHEMA_INFERENCE_CONFIGURATION.members["maxTokens"], self.max_tokens
            )
        if self.temperature is not None:
            serializer.write_float(
                _SCHEMA_INFERENCE_CONFIGURATION.members["temperature"], self.temperature
            )
        if self.top_p is not None:
            serializer.write_float(
                _SCHEMA_INFERENCE_CONFIGURATION.members["topP"], self.top_p
            )
        if self.stop_sequences is not None:
            _serialize_non_empty_string_list(
                serializer,
                _SCHEMA_INFERENCE_CONFIGURATION.members["stopSequences"],
                self.stop_sequences,
            )

    @classmethod
    def deserialize(cls, deserializer: ShapeDeserializer) -> Self:
        return cls(**cls.deserialize_kwargs(deserializer))

    @classmethod
    def deserialize_kwargs(cls, deserializer: ShapeDeserializer) -> dict[str, Any]:
        kwargs: dict[str, Any] = {}

        def _consumer(schema: Schema, de: ShapeDeserializer) -> None:
            match schema.expect_member_index():
                case 0:
                    kwargs["max_tokens"] = de.read_integer(
                        _SCHEMA_INFERENCE_CONFIGURATION.members["maxTokens"]
                    )
                case 1:
                    kwargs["temperature"] = de.read_float(
                        _SCHEMA_INFERENCE_CONFIGURATION.members["temperature"]
                    )
                case 2:
                    kwargs["top_p"] = de.read_float(
                        _SCHEMA_INFERENCE_CONFIGURATION.members["topP"]
                    )
                case 3:
                    kwargs["stop_sequences"] = _deserialize_non_empty_string_list(
                        de, _SCHEMA_INFERENCE_CONFIGURATION.members["stopSequences"]
                    )
                case _:
                    logger.debug("Unexpected member schema: %s", schema)

        deserializer.read_struct(_SCHEMA_INFERENCE_CONFIGURATION, consumer=_consumer)
        return kwargs
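For intuition, serialize_members above writes only the fields that are set and maps the Python attribute names to the camelCase member names defined in _SCHEMA_INFERENCE_CONFIGURATION (maxTokens, temperature, topP, stopSequences). The sketch below mirrors that mapping by hand as a plain dict; it is illustrative only and does not use the real ShapeSerializer machinery.

```python
# Illustrative only: hand-rolled equivalent of serialize_members(), skipping
# unset (None) fields and using the wire member names from the schema.
def inference_config_to_wire(config) -> dict:
    wire = {}
    if config.max_tokens is not None:
        wire["maxTokens"] = config.max_tokens
    if config.temperature is not None:
        wire["temperature"] = config.temperature
    if config.top_p is not None:
        wire["topP"] = config.top_p
    if config.stop_sequences is not None:
        wire["stopSequences"] = list(config.stop_sequences)
    return wire

# inference_config_to_wire(InferenceConfiguration(temperature=0.2))
# -> {"temperature": 0.2}; unset fields fall back to the model's defaults.
```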
Attributes

max_tokens (class-attribute, instance-attribute)
max_tokens: int | None = None
The maximum number of tokens to allow in the generated response. The default value is the maximum allowed value for the model that you are using. For more information, see Inference parameters for foundation models.

stop_sequences (class-attribute, instance-attribute)
stop_sequences: list[str] | None = None
A list of stop sequences. A stop sequence is a sequence of characters that causes the model to stop generating the response.

temperature (class-attribute, instance-attribute)
temperature: float | None = None
The likelihood of the model selecting higher-probability options while generating a response. A lower value makes the model more likely to choose higher-probability options, while a higher value makes the model more likely to choose lower-probability options. The default value is the default value for the model that you are using. For more information, see Inference parameters for foundation models.

top_p (class-attribute, instance-attribute)
top_p: float | None = None
The percentage of most-likely candidates that the model considers for the next token. For example, if you choose a value of 0.8 for topP, the model selects from the top 80% of the probability distribution of tokens that could be next in the sequence. The default value is the default value for the model that you are using. For more information, see Inference parameters for foundation models.
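Because temperature and top_p both shape how the model samples the next token, a common approach is to set one and leave the other at the model default. The values below are arbitrary illustrations, not recommendations from this page.

```python
# Arbitrary example values; any field left as None falls back to the model's default.
mostly_deterministic = InferenceConfiguration(
    temperature=0.0,         # strongly favor the highest-probability tokens
    max_tokens=256,
)

more_exploratory = InferenceConfiguration(
    top_p=0.8,               # sample only from the top 80% of the probability mass
    stop_sequences=["END"],  # stop generating when this marker appears
)
```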