
invoke_model

Operation

invoke_model async

invoke_model(input: InvokeModelInput, plugins: list[Plugin] | None = None) -> InvokeModelOutput

Invokes the specified Amazon Bedrock model to run inference using the prompt and inference parameters provided in the request body. You use model inference to generate text, images, and embeddings.

For example code, see Invoke model code examples in the Amazon Bedrock User Guide.

This operation requires permission for the bedrock:InvokeModel action.

Warning

To deny all inference access to resources that you specify in the modelId field, you need to deny access to the bedrock:InvokeModel and bedrock:InvokeModelWithResponseStream actions. Doing this also denies access to the resource through the Converse API actions (Converse and ConverseStream). For more information, see Deny access for inference on specific models.

For troubleshooting some of the common errors you might encounter when using the InvokeModel API, see Troubleshooting Amazon Bedrock API Error Codes in the Amazon Bedrock User Guide.
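
To make the call shape concrete, here is a minimal sketch. The client class name (BedrockRuntimeClient) and its zero-argument construction are assumptions about this SDK's generated code, and the request-body keys follow Anthropic's Claude messages format purely as an example; check your model's inference parameters for the real schema.

import asyncio
import json

from aws_sdk_bedrock_runtime.client import BedrockRuntimeClient  # assumed class name
from aws_sdk_bedrock_runtime.models import InvokeModelInput

async def main() -> None:
    client = BedrockRuntimeClient()  # assumes default config/credential resolution

    # Model-specific JSON body, encoded to bytes (Claude messages format shown).
    body = json.dumps({
        "anthropic_version": "bedrock-2023-05-31",
        "max_tokens": 256,
        "messages": [{"role": "user", "content": "Hello, Bedrock!"}],
    }).encode("utf-8")

    output = await client.invoke_model(
        InvokeModelInput(
            model_id="anthropic.claude-3-haiku-20240307-v1:0",  # example model ID
            content_type="application/json",
            accept="application/json",
            body=body,
        )
    )
    print(json.loads(output.body))

asyncio.run(main())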

Parameters:

input (InvokeModelInput, required)
    An instance of InvokeModelInput.

plugins (list[Plugin] | None, default: None)
    A list of callables that modify the configuration dynamically. Changes made by these plugins only apply for the duration of the operation execution and will not affect any other operation invocations.
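
Because a plugin is simply a callable that receives the per-operation config (which the client deep-copies before applying plugins, as the source below shows), a single call can be adjusted without touching the client's shared configuration. A minimal sketch, assuming the client and input from the example above; retry_strategy is a real config field per the source below, but SimpleRetryStrategy and its import path are hypothetical stand-ins for whatever retry strategy type your build provides:

from smithy_core.retries import SimpleRetryStrategy  # hypothetical import path

def limit_retries(config) -> None:
    # Mutations here last only for this single operation invocation.
    config.retry_strategy = SimpleRetryStrategy(max_attempts=1)  # hypothetical type

output = await client.invoke_model(input, plugins=[limit_retries])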

Returns:

InvokeModelOutput
    An instance of InvokeModelOutput.

Source code in src/aws_sdk_bedrock_runtime/client.py
async def invoke_model(
    self, input: InvokeModelInput, plugins: list[Plugin] | None = None
) -> InvokeModelOutput:
    """Invokes the specified Amazon Bedrock model to run inference using the
    prompt and inference parameters provided in the request body. You use
    model inference to generate text, images, and embeddings.

    For example code, see *Invoke model code examples* in the *Amazon
    Bedrock User Guide*.

    This operation requires permission for the `bedrock:InvokeModel` action.

    Warning:
        To deny all inference access to resources that you specify in the
        modelId field, you need to deny access to the `bedrock:InvokeModel` and
        `bedrock:InvokeModelWithResponseStream` actions. Doing this also denies
        access to the resource through the Converse API actions
        ([Converse](https://docs.aws.amazon.com/bedrock/latest/APIReference/API_runtime_Converse.html)
        and
        [ConverseStream](https://docs.aws.amazon.com/bedrock/latest/APIReference/API_runtime_ConverseStream.html)).
        For more information, see [Deny access for inference on specific
        models](https://docs.aws.amazon.com/bedrock/latest/userguide/security_iam_id-based-policy-examples.html#security_iam_id-based-policy-examples-deny-inference).

    For troubleshooting some of the common errors you might encounter when
    using the `InvokeModel` API, see [Troubleshooting Amazon Bedrock API
    Error
    Codes](https://docs.aws.amazon.com/bedrock/latest/userguide/troubleshooting-api-error-codes.html)
    in the *Amazon Bedrock User Guide*.

    Args:
        input:
            An instance of `InvokeModelInput`.
        plugins:
            A list of callables that modify the configuration dynamically.
            Changes made by these plugins only apply for the duration of the
            operation execution and will not affect any other operation
            invocations.

    Returns:
        An instance of `InvokeModelOutput`.
    """
    operation_plugins: list[Plugin] = []
    if plugins:
        operation_plugins.extend(plugins)
    config = deepcopy(self._config)
    for plugin in operation_plugins:
        plugin(config)
    if config.protocol is None or config.transport is None:
        raise ExpectationNotMetError(
            "protocol and transport MUST be set on the config to make calls."
        )
    pipeline = RequestPipeline(protocol=config.protocol, transport=config.transport)
    call = ClientCall(
        input=input,
        operation=INVOKE_MODEL,
        context=TypedProperties({"config": config}),
        interceptor=InterceptorChain(config.interceptors),
        auth_scheme_resolver=config.auth_scheme_resolver,
        supported_auth_schemes=config.auth_schemes,
        endpoint_resolver=config.endpoint_resolver,
        retry_strategy=config.retry_strategy,
    )

    return await pipeline(call)

Input

InvokeModelInput dataclass

Dataclass for InvokeModelInput structure.

Source code in src/aws_sdk_bedrock_runtime/models.py
@dataclass(kw_only=True)
class InvokeModelInput:
    """Dataclass for InvokeModelInput structure."""

    body: bytes | None = field(repr=False, default=None)
    """The prompt and inference parameters in the format specified in the
    `contentType` in the header. You must provide the body in JSON format.
    To see the format and content of the request and response bodies for
    different models, refer to [Inference
    parameters](https://docs.aws.amazon.com/bedrock/latest/userguide/model-parameters.html).
    For more information, see [Run
    inference](https://docs.aws.amazon.com/bedrock/latest/userguide/api-methods-run.html)
    in the Bedrock User Guide.
    """

    content_type: str | None = None
    """The MIME type of the input data in the request. You must specify
    `application/json`.
    """

    accept: str | None = None
    """The desired MIME type of the inference body in the response. The default
    value is `application/json`.
    """

    model_id: str | None = None
    """The unique identifier of the model to invoke to run inference.

    The `modelId` to provide depends on the type of model or throughput that
    you use:

    - If you use a base model, specify the model ID or its ARN. For a list
      of model IDs for base models, see [Amazon Bedrock base model IDs
      (on-demand
      throughput)](https://docs.aws.amazon.com/bedrock/latest/userguide/model-ids.html#model-ids-arns)
      in the Amazon Bedrock User Guide.

    - If you use an inference profile, specify the inference profile ID or
      its ARN. For a list of inference profile IDs, see [Supported Regions
      and models for cross-region
      inference](https://docs.aws.amazon.com/bedrock/latest/userguide/cross-region-inference-support.html)
      in the Amazon Bedrock User Guide.

    - If you use a provisioned model, specify the ARN of the Provisioned
      Throughput. For more information, see [Run inference using a
      Provisioned
      Throughput](https://docs.aws.amazon.com/bedrock/latest/userguide/prov-thru-use.html)
      in the Amazon Bedrock User Guide.

    - If you use a custom model, specify the ARN of the custom model
      deployment (for on-demand inference) or the ARN of your provisioned
      model (for Provisioned Throughput). For more information, see [Use a
      custom model in Amazon
      Bedrock](https://docs.aws.amazon.com/bedrock/latest/userguide/model-customization-use.html)
      in the Amazon Bedrock User Guide.

    - If you use an [imported
      model](https://docs.aws.amazon.com/bedrock/latest/userguide/model-customization-import-model.html),
      specify the ARN of the imported model. You can get the model ARN from
      a successful call to
      [CreateModelImportJob](https://docs.aws.amazon.com/bedrock/latest/APIReference/API_CreateModelImportJob.html)
      or from the Imported models page in the Amazon Bedrock console.
    """

    trace: str | None = None
    """Specifies whether to enable or disable the Bedrock trace. If enabled,
    you can see the full Bedrock trace.
    """

    guardrail_identifier: str | None = None
    """The unique identifier of the guardrail that you want to use. If you
    don't provide a value, no guardrail is applied to the invocation.

    An error will be thrown in the following situations.

    - You don't provide a guardrail identifier but you specify the
      `amazon-bedrock-guardrailConfig` field in the request body.

    - You enable the guardrail but the `contentType` isn't
      `application/json`.

    - You provide a guardrail identifier, but `guardrailVersion` isn't
      specified.
    """

    guardrail_version: str | None = None
    """The version number for the guardrail. The value can also be `DRAFT`."""

    performance_config_latency: str = "standard"
    """Model performance settings for the request."""

    def serialize(self, serializer: ShapeSerializer):
        serializer.write_struct(_SCHEMA_INVOKE_MODEL_INPUT, self)

    def serialize_members(self, serializer: ShapeSerializer):
        if self.body is not None:
            serializer.write_blob(_SCHEMA_INVOKE_MODEL_INPUT.members["body"], self.body)

        if self.content_type is not None:
            serializer.write_string(
                _SCHEMA_INVOKE_MODEL_INPUT.members["contentType"], self.content_type
            )

        if self.accept is not None:
            serializer.write_string(
                _SCHEMA_INVOKE_MODEL_INPUT.members["accept"], self.accept
            )

        if self.model_id is not None:
            serializer.write_string(
                _SCHEMA_INVOKE_MODEL_INPUT.members["modelId"], self.model_id
            )

        if self.trace is not None:
            serializer.write_string(
                _SCHEMA_INVOKE_MODEL_INPUT.members["trace"], self.trace
            )

        if self.guardrail_identifier is not None:
            serializer.write_string(
                _SCHEMA_INVOKE_MODEL_INPUT.members["guardrailIdentifier"],
                self.guardrail_identifier,
            )

        if self.guardrail_version is not None:
            serializer.write_string(
                _SCHEMA_INVOKE_MODEL_INPUT.members["guardrailVersion"],
                self.guardrail_version,
            )

        if self.performance_config_latency is not None:
            serializer.write_string(
                _SCHEMA_INVOKE_MODEL_INPUT.members["performanceConfigLatency"],
                self.performance_config_latency,
            )

    @classmethod
    def deserialize(cls, deserializer: ShapeDeserializer) -> Self:
        return cls(**cls.deserialize_kwargs(deserializer))

    @classmethod
    def deserialize_kwargs(cls, deserializer: ShapeDeserializer) -> dict[str, Any]:
        kwargs: dict[str, Any] = {}

        def _consumer(schema: Schema, de: ShapeDeserializer) -> None:
            match schema.expect_member_index():
                case 0:
                    kwargs["body"] = de.read_blob(
                        _SCHEMA_INVOKE_MODEL_INPUT.members["body"]
                    )

                case 1:
                    kwargs["content_type"] = de.read_string(
                        _SCHEMA_INVOKE_MODEL_INPUT.members["contentType"]
                    )

                case 2:
                    kwargs["accept"] = de.read_string(
                        _SCHEMA_INVOKE_MODEL_INPUT.members["accept"]
                    )

                case 3:
                    kwargs["model_id"] = de.read_string(
                        _SCHEMA_INVOKE_MODEL_INPUT.members["modelId"]
                    )

                case 4:
                    kwargs["trace"] = de.read_string(
                        _SCHEMA_INVOKE_MODEL_INPUT.members["trace"]
                    )

                case 5:
                    kwargs["guardrail_identifier"] = de.read_string(
                        _SCHEMA_INVOKE_MODEL_INPUT.members["guardrailIdentifier"]
                    )

                case 6:
                    kwargs["guardrail_version"] = de.read_string(
                        _SCHEMA_INVOKE_MODEL_INPUT.members["guardrailVersion"]
                    )

                case 7:
                    kwargs["performance_config_latency"] = de.read_string(
                        _SCHEMA_INVOKE_MODEL_INPUT.members["performanceConfigLatency"]
                    )

                case _:
                    logger.debug("Unexpected member schema: %s", schema)

        deserializer.read_struct(_SCHEMA_INVOKE_MODEL_INPUT, consumer=_consumer)
        return kwargs

Attributes

accept class-attribute instance-attribute
accept: str | None = None

The desired MIME type of the inference body in the response. The default value is application/json.

body class-attribute instance-attribute
body: bytes | None = field(repr=False, default=None)

The prompt and inference parameters in the format specified in the contentType in the header. You must provide the body in JSON format. To see the format and content of the request and response bodies for different models, refer to Inference parameters. For more information, see Run inference in the Bedrock User Guide.
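
For instance, a minimal sketch of producing this field for an embeddings model; the inputText key follows the Amazon Titan Text Embeddings request format and is only an example of one model's schema:

import json

# The body is model-specific JSON encoded to bytes; this shape is the
# Titan Text Embeddings request format (example only).
body = json.dumps({"inputText": "Hello, Bedrock!"}).encode("utf-8")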

content_type class-attribute instance-attribute
content_type: str | None = None

The MIME type of the input data in the request. You must specify application/json.

guardrail_identifier class-attribute instance-attribute
guardrail_identifier: str | None = None

The unique identifier of the guardrail that you want to use. If you don't provide a value, no guardrail is applied to the invocation.

An error will be thrown in the following situations.

  • You don't provide a guardrail identifier but you specify the amazon-bedrock-guardrailConfig field in the request body.

  • You enable the guardrail but the contentType isn't application/json.

  • You provide a guardrail identifier, but guardrailVersion isn't specified.
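
Putting those rules together, here is a sketch of a request that satisfies all three conditions; the model ID and guardrail ARN are placeholders:

# Both guardrail fields are set and contentType stays application/json,
# so none of the three error conditions above is triggered.
request = InvokeModelInput(
    model_id="anthropic.claude-3-haiku-20240307-v1:0",  # placeholder model ID
    content_type="application/json",
    body=body,  # JSON request body as sketched earlier
    guardrail_identifier="arn:aws:bedrock:us-east-1:111122223333:guardrail/abc123",  # placeholder ARN
    guardrail_version="DRAFT",
)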

guardrail_version class-attribute instance-attribute
guardrail_version: str | None = None

The version number for the guardrail. The value can also be DRAFT.

model_id class-attribute instance-attribute
model_id: str | None = None

The unique identifier of the model to invoke to run inference.

The modelId to provide depends on the type of model or throughput that you use:

  • If you use a base model, specify the model ID or its ARN. For a list of model IDs for base models, see Amazon Bedrock base model IDs (on-demand throughput) in the Amazon Bedrock User Guide.

  • If you use an inference profile, specify the inference profile ID or its ARN. For a list of inference profile IDs, see Supported Regions and models for cross-region inference in the Amazon Bedrock User Guide.

  • If you use a provisioned model, specify the ARN of the Provisioned Throughput. For more information, see Run inference using a Provisioned Throughput in the Amazon Bedrock User Guide.

  • If you use a custom model, specify the ARN of the custom model deployment (for on-demand inference) or the ARN of your provisioned model (for Provisioned Throughput). For more information, see Use a custom model in Amazon Bedrock in the Amazon Bedrock User Guide.

  • If you use an imported model, specify the ARN of the imported model. You can get the model ARN from a successful call to CreateModelImportJob or from the Imported models page in the Amazon Bedrock console.

performance_config_latency class-attribute instance-attribute
performance_config_latency: str = 'standard'

Model performance settings for the request.
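
Besides the "standard" default, the service also accepts "optimized" for latency-optimized inference, which only certain models support; a hedged sketch, reusing the placeholder body from above:

# Opt in to latency-optimized inference; the model ID is a placeholder
# and must be one that actually supports latency optimization.
request = InvokeModelInput(
    model_id="us.anthropic.claude-3-5-haiku-20241022-v1:0",  # placeholder
    content_type="application/json",
    body=body,
    performance_config_latency="optimized",
)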

trace class-attribute instance-attribute
trace: str | None = None

Specifies whether to enable or disable the Bedrock trace. If enabled, you can see the full Bedrock trace.

Output

InvokeModelOutput dataclass

Dataclass for InvokeModelOutput structure.

Source code in src/aws_sdk_bedrock_runtime/models.py
@dataclass(kw_only=True)
class InvokeModelOutput:
    """Dataclass for InvokeModelOutput structure."""

    body: bytes = field(repr=False)
    """Inference response from the model in the format specified in the
    `contentType` header. To see the format and content of the request and
    response bodies for different models, refer to [Inference
    parameters](https://docs.aws.amazon.com/bedrock/latest/userguide/model-parameters.html).
    """

    content_type: str
    """The MIME type of the inference result."""

    performance_config_latency: str | None = None
    """Model performance settings for the request."""

    def serialize(self, serializer: ShapeSerializer):
        serializer.write_struct(_SCHEMA_INVOKE_MODEL_OUTPUT, self)

    def serialize_members(self, serializer: ShapeSerializer):
        serializer.write_blob(_SCHEMA_INVOKE_MODEL_OUTPUT.members["body"], self.body)
        serializer.write_string(
            _SCHEMA_INVOKE_MODEL_OUTPUT.members["contentType"], self.content_type
        )
        if self.performance_config_latency is not None:
            serializer.write_string(
                _SCHEMA_INVOKE_MODEL_OUTPUT.members["performanceConfigLatency"],
                self.performance_config_latency,
            )

    @classmethod
    def deserialize(cls, deserializer: ShapeDeserializer) -> Self:
        return cls(**cls.deserialize_kwargs(deserializer))

    @classmethod
    def deserialize_kwargs(cls, deserializer: ShapeDeserializer) -> dict[str, Any]:
        kwargs: dict[str, Any] = {}

        def _consumer(schema: Schema, de: ShapeDeserializer) -> None:
            match schema.expect_member_index():
                case 0:
                    kwargs["body"] = de.read_blob(
                        _SCHEMA_INVOKE_MODEL_OUTPUT.members["body"]
                    )

                case 1:
                    kwargs["content_type"] = de.read_string(
                        _SCHEMA_INVOKE_MODEL_OUTPUT.members["contentType"]
                    )

                case 2:
                    kwargs["performance_config_latency"] = de.read_string(
                        _SCHEMA_INVOKE_MODEL_OUTPUT.members["performanceConfigLatency"]
                    )

                case _:
                    logger.debug("Unexpected member schema: %s", schema)

        deserializer.read_struct(_SCHEMA_INVOKE_MODEL_OUTPUT, consumer=_consumer)
        return kwargs

Attributes

body class-attribute instance-attribute
body: bytes = field(repr=False)

Inference response from the model in the format specified in the contentType header. To see the format and content of the request and response bodies for different models, refer to Inference parameters.
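
Consuming this field is typically a JSON decode gated on the returned content type; a short sketch, reusing the client and request from the examples above:

import json

output = await client.invoke_model(request)
if output.content_type == "application/json":
    result = json.loads(output.body)  # body is raw bytes; decode per content_type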

content_type instance-attribute
content_type: str

The MIME type of the inference result.

performance_config_latency class-attribute instance-attribute
performance_config_latency: str | None = None

Model performance settings for the request.

Errors