
invoke_model

Operation

invoke_model async

invoke_model(input: InvokeModelInput, plugins: list[Plugin] | None = None) -> InvokeModelOutput

Invokes the specified Amazon Bedrock model to run inference using the prompt and inference parameters provided in the request body. You use model inference to generate text, images, and embeddings.

For example code, see Invoke model code examples in the Amazon Bedrock User Guide.

This operation requires permission for the bedrock:InvokeModel action.

Warning

To deny all inference access to resources that you specify in the modelId field, you need to deny access to the bedrock:InvokeModel and bedrock:InvokeModelWithResponseStream actions. Doing this also denies access to the resource through the Converse API actions (Converse and ConverseStream). For more information, see Deny access for inference on specific models.

For troubleshooting some of the common errors you might encounter when using the InvokeModel API, see Troubleshooting Amazon Bedrock API Error Codes in the Amazon Bedrock User Guide.
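
To make the call shape concrete, here is a minimal sketch. The client class name (BedrockRuntimeClient) and its zero-argument construction are assumptions about this SDK's generated code, and the request-body keys follow Anthropic's Claude messages format purely as an example; check your model's inference parameters for the real schema.

import asyncio
import json

from aws_sdk_bedrock_runtime.client import BedrockRuntimeClient  # assumed class name
from aws_sdk_bedrock_runtime.models import InvokeModelInput

async def main() -> None:
    client = BedrockRuntimeClient()  # assumes default config/credential resolution

    # Model-specific JSON body, encoded to bytes (Claude messages format shown).
    body = json.dumps({
        "anthropic_version": "bedrock-2023-05-31",
        "max_tokens": 256,
        "messages": [{"role": "user", "content": "Hello, Bedrock!"}],
    }).encode("utf-8")

    output = await client.invoke_model(
        InvokeModelInput(
            model_id="anthropic.claude-3-haiku-20240307-v1:0",  # example model ID
            content_type="application/json",
            accept="application/json",
            body=body,
        )
    )
    print(json.loads(output.body))

asyncio.run(main())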

Parameters:

input (InvokeModelInput, required)
    An instance of InvokeModelInput.

plugins (list[Plugin] | None, default: None)
    A list of callables that modify the configuration dynamically. Changes made by these plugins only apply for the duration of the operation execution and will not affect any other operation invocations.
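
Because a plugin is simply a callable that receives the per-operation config (which the client deep-copies before applying plugins, as the source below shows), a single call can be adjusted without touching the client's shared configuration. A minimal sketch, assuming the client and input from the example above; retry_strategy is a real config field per the source below, but SimpleRetryStrategy and its import path are hypothetical stand-ins for whatever retry strategy type your build provides:

from smithy_core.retries import SimpleRetryStrategy  # hypothetical import path

def limit_retries(config) -> None:
    # Mutations here last only for this single operation invocation.
    config.retry_strategy = SimpleRetryStrategy(max_attempts=1)  # hypothetical type

output = await client.invoke_model(input, plugins=[limit_retries])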

Returns:

InvokeModelOutput
    An instance of InvokeModelOutput.

Source code in src/aws_sdk_bedrock_runtime/client.py
async def invoke_model(
    self, input: InvokeModelInput, plugins: list[Plugin] | None = None
) -> InvokeModelOutput:
    """Invokes the specified Amazon Bedrock model to run inference using the
    prompt and inference parameters provided in the request body. You use
    model inference to generate text, images, and embeddings.

    For example code, see *Invoke model code examples* in the *Amazon
    Bedrock User Guide*.

    This operation requires permission for the `bedrock:InvokeModel` action.

    Warning:
        To deny all inference access to resources that you specify in the
        modelId field, you need to deny access to the `bedrock:InvokeModel` and
        `bedrock:InvokeModelWithResponseStream` actions. Doing this also denies
        access to the resource through the Converse API actions
        ([Converse](https://docs.aws.amazon.com/bedrock/latest/APIReference/API_runtime_Converse.html)
        and
        [ConverseStream](https://docs.aws.amazon.com/bedrock/latest/APIReference/API_runtime_ConverseStream.html)).
        For more information, see [Deny access for inference on specific
        models](https://docs.aws.amazon.com/bedrock/latest/userguide/security_iam_id-based-policy-examples.html#security_iam_id-based-policy-examples-deny-inference).

    For troubleshooting some of the common errors you might encounter when
    using the `InvokeModel` API, see [Troubleshooting Amazon Bedrock API
    Error
    Codes](https://docs.aws.amazon.com/bedrock/latest/userguide/troubleshooting-api-error-codes.html)
    in the *Amazon Bedrock User Guide*.

    Args:
        input:
            An instance of `InvokeModelInput`.
        plugins:
            A list of callables that modify the configuration dynamically.
            Changes made by these plugins only apply for the duration of the
            operation execution and will not affect any other operation
            invocations.

    Returns:
        An instance of `InvokeModelOutput`.
    """
    operation_plugins: list[Plugin] = []
    if plugins:
        operation_plugins.extend(plugins)
    config = deepcopy(self._config)
    for plugin in operation_plugins:
        plugin(config)
    if config.protocol is None or config.transport is None:
        raise ExpectationNotMetError(
            "protocol and transport MUST be set on the config to make calls."
        )
    pipeline = RequestPipeline(protocol=config.protocol, transport=config.transport)
    call = ClientCall(
        input=input,
        operation=INVOKE_MODEL,
        context=TypedProperties({"config": config}),
        interceptor=InterceptorChain(config.interceptors),
        auth_scheme_resolver=config.auth_scheme_resolver,
        supported_auth_schemes=config.auth_schemes,
        endpoint_resolver=config.endpoint_resolver,
        retry_strategy=config.retry_strategy,
    )

    return await pipeline(call)

Input

InvokeModelInput dataclass

Dataclass for InvokeModelInput structure.

Source code in src/aws_sdk_bedrock_runtime/models.py
@dataclass(kw_only=True)
class InvokeModelInput:
    """Dataclass for InvokeModelInput structure."""

    body: bytes | None = field(repr=False, default=None)
    """The prompt and inference parameters in the format specified in the
    `contentType` in the header. You must provide the body in JSON format.
    To see the format and content of the request and response bodies for
    different models, refer to [Inference
    parameters](https://docs.aws.amazon.com/bedrock/latest/userguide/model-parameters.html).
    For more information, see [Run
    inference](https://docs.aws.amazon.com/bedrock/latest/userguide/api-methods-run.html)
    in the Bedrock User Guide.
    """

    content_type: str | None = None
    """The MIME type of the input data in the request. You must specify
    `application/json`.
    """

    accept: str | None = None
    """The desired MIME type of the inference body in the response. The default
    value is `application/json`.
    """

    model_id: str | None = None
    """The unique identifier of the model to invoke to run inference.

    The `modelId` to provide depends on the type of model or throughput that
    you use:

    - If you use a base model, specify the model ID or its ARN. For a list
      of model IDs for base models, see [Amazon Bedrock base model IDs
      (on-demand
      throughput)](https://docs.aws.amazon.com/bedrock/latest/userguide/model-ids.html#model-ids-arns)
      in the Amazon Bedrock User Guide.

    - If you use an inference profile, specify the inference profile ID or
      its ARN. For a list of inference profile IDs, see [Supported Regions
      and models for cross-region
      inference](https://docs.aws.amazon.com/bedrock/latest/userguide/cross-region-inference-support.html)
      in the Amazon Bedrock User Guide.

    - If you use a provisioned model, specify the ARN of the Provisioned
      Throughput. For more information, see [Run inference using a
      Provisioned
      Throughput](https://docs.aws.amazon.com/bedrock/latest/userguide/prov-thru-use.html)
      in the Amazon Bedrock User Guide.

    - If you use a custom model, specify the ARN of the custom model
      deployment (for on-demand inference) or the ARN of your provisioned
      model (for Provisioned Throughput). For more information, see [Use a
      custom model in Amazon
      Bedrock](https://docs.aws.amazon.com/bedrock/latest/userguide/model-customization-use.html)
      in the Amazon Bedrock User Guide.

    - If you use an [imported
      model](https://docs.aws.amazon.com/bedrock/latest/userguide/model-customization-import-model.html),
      specify the ARN of the imported model. You can get the model ARN from
      a successful call to
      [CreateModelImportJob](https://docs.aws.amazon.com/bedrock/latest/APIReference/API_CreateModelImportJob.html)
      or from the Imported models page in the Amazon Bedrock console.
    """

    trace: str | None = None
    """Specifies whether to enable or disable the Bedrock trace. If enabled,
    you can see the full Bedrock trace.
    """

    guardrail_identifier: str | None = None
    """The unique identifier of the guardrail that you want to use. If you
    don't provide a value, no guardrail is applied to the invocation.

    An error will be thrown in the following situations.

    - You don't provide a guardrail identifier but you specify the
      `amazon-bedrock-guardrailConfig` field in the request body.

    - You enable the guardrail but the `contentType` isn't
      `application/json`.

    - You provide a guardrail identifier, but `guardrailVersion` isn't
      specified.
    """

    guardrail_version: str | None = None
    """The version number for the guardrail. The value can also be `DRAFT`."""

    performance_config_latency: str = "standard"
    """Model performance settings for the request."""

    def serialize(self, serializer: ShapeSerializer):
        serializer.write_struct(_SCHEMA_INVOKE_MODEL_INPUT, self)

    def serialize_members(self, serializer: ShapeSerializer):
        if self.body is not None:
            serializer.write_blob(_SCHEMA_INVOKE_MODEL_INPUT.members["body"], self.body)

        if self.content_type is not None:
            serializer.write_string(
                _SCHEMA_INVOKE_MODEL_INPUT.members["contentType"], self.content_type
            )

        if self.accept is not None:
            serializer.write_string(
                _SCHEMA_INVOKE_MODEL_INPUT.members["accept"], self.accept
            )

        if self.model_id is not None:
            serializer.write_string(
                _SCHEMA_INVOKE_MODEL_INPUT.members["modelId"], self.model_id
            )

        if self.trace is not None:
            serializer.write_string(
                _SCHEMA_INVOKE_MODEL_INPUT.members["trace"], self.trace
            )

        if self.guardrail_identifier is not None:
            serializer.write_string(
                _SCHEMA_INVOKE_MODEL_INPUT.members["guardrailIdentifier"],
                self.guardrail_identifier,
            )

        if self.guardrail_version is not None:
            serializer.write_string(
                _SCHEMA_INVOKE_MODEL_INPUT.members["guardrailVersion"],
                self.guardrail_version,
            )

        if self.performance_config_latency is not None:
            serializer.write_string(
                _SCHEMA_INVOKE_MODEL_INPUT.members["performanceConfigLatency"],
                self.performance_config_latency,
            )

    @classmethod
    def deserialize(cls, deserializer: ShapeDeserializer) -> Self:
        return cls(**cls.deserialize_kwargs(deserializer))

    @classmethod
    def deserialize_kwargs(cls, deserializer: ShapeDeserializer) -> dict[str, Any]:
        kwargs: dict[str, Any] = {}

        def _consumer(schema: Schema, de: ShapeDeserializer) -> None:
            match schema.expect_member_index():
                case 0:
                    kwargs["body"] = de.read_blob(
                        _SCHEMA_INVOKE_MODEL_INPUT.members["body"]
                    )

                case 1:
                    kwargs["content_type"] = de.read_string(
                        _SCHEMA_INVOKE_MODEL_INPUT.members["contentType"]
                    )

                case 2:
                    kwargs["accept"] = de.read_string(
                        _SCHEMA_INVOKE_MODEL_INPUT.members["accept"]
                    )

                case 3:
                    kwargs["model_id"] = de.read_string(
                        _SCHEMA_INVOKE_MODEL_INPUT.members["modelId"]
                    )

                case 4:
                    kwargs["trace"] = de.read_string(
                        _SCHEMA_INVOKE_MODEL_INPUT.members["trace"]
                    )

                case 5:
                    kwargs["guardrail_identifier"] = de.read_string(
                        _SCHEMA_INVOKE_MODEL_INPUT.members["guardrailIdentifier"]
                    )

                case 6:
                    kwargs["guardrail_version"] = de.read_string(
                        _SCHEMA_INVOKE_MODEL_INPUT.members["guardrailVersion"]
                    )

                case 7:
                    kwargs["performance_config_latency"] = de.read_string(
                        _SCHEMA_INVOKE_MODEL_INPUT.members["performanceConfigLatency"]
                    )

                case _:
                    logger.debug("Unexpected member schema: %s", schema)

        deserializer.read_struct(_SCHEMA_INVOKE_MODEL_INPUT, consumer=_consumer)
        return kwargs

Attributes

accept class-attribute instance-attribute
accept: str | None = None

The desired MIME type of the inference body in the response. The default value is application/json.

body class-attribute instance-attribute
body: bytes | None = field(repr=False, default=None)

The prompt and inference parameters in the format specified in the contentType in the header. You must provide the body in JSON format. To see the format and content of the request and response bodies for different models, refer to Inference parameters. For more information, see Run inference in the Bedrock User Guide.
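
For instance, a minimal sketch of producing this field for an embeddings model; the inputText key follows the Amazon Titan Text Embeddings request format and is only an example of one model's schema:

import json

# The body is model-specific JSON encoded to bytes; this shape is the
# Titan Text Embeddings request format (example only).
body = json.dumps({"inputText": "Hello, Bedrock!"}).encode("utf-8")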

content_type class-attribute instance-attribute
content_type: str | None = None

The MIME type of the input data in the request. You must specify application/json.

guardrail_identifier class-attribute instance-attribute
guardrail_identifier: str | None = None

The unique identifier of the guardrail that you want to use. If you don't provide a value, no guardrail is applied to the invocation.

An error will be thrown in the following situations.

  • You don't provide a guardrail identifier but you specify the amazon-bedrock-guardrailConfig field in the request body.

  • You enable the guardrail but the contentType isn't application/json.

  • You provide a guardrail identifier, but guardrailVersion isn't specified.
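
Putting those rules together, here is a sketch of a request that satisfies all three conditions; the model ID and guardrail ARN are placeholders:

# Both guardrail fields are set and contentType stays application/json,
# so none of the three error conditions above is triggered.
request = InvokeModelInput(
    model_id="anthropic.claude-3-haiku-20240307-v1:0",  # placeholder model ID
    content_type="application/json",
    body=body,  # JSON request body as sketched earlier
    guardrail_identifier="arn:aws:bedrock:us-east-1:111122223333:guardrail/abc123",  # placeholder ARN
    guardrail_version="DRAFT",
)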

guardrail_version class-attribute instance-attribute
guardrail_version: str | None = None

The version number for the guardrail. The value can also be DRAFT.

model_id class-attribute instance-attribute
model_id: str | None = None

The unique identifier of the model to invoke to run inference.

The modelId to provide depends on the type of model or throughput that you use:

  • If you use a base model, specify the model ID or its ARN. For a list of model IDs for base models, see Amazon Bedrock base model IDs (on-demand throughput) in the Amazon Bedrock User Guide.

  • If you use an inference profile, specify the inference profile ID or its ARN. For a list of inference profile IDs, see Supported Regions and models for cross-region inference in the Amazon Bedrock User Guide.

  • If you use a provisioned model, specify the ARN of the Provisioned Throughput. For more information, see Run inference using a Provisioned Throughput in the Amazon Bedrock User Guide.

  • If you use a custom model, specify the ARN of the custom model deployment (for on-demand inference) or the ARN of your provisioned model (for Provisioned Throughput). For more information, see Use a custom model in Amazon Bedrock in the Amazon Bedrock User Guide.

  • If you use an imported model, specify the ARN of the imported model. You can get the model ARN from a successful call to CreateModelImportJob or from the Imported models page in the Amazon Bedrock console.

performance_config_latency class-attribute instance-attribute
performance_config_latency: str = 'standard'

Model performance settings for the request.
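
Besides the "standard" default, the service also accepts "optimized" for latency-optimized inference, which only certain models support; a hedged sketch, reusing the placeholder body from above:

# Opt in to latency-optimized inference; the model ID is a placeholder
# and must be one that actually supports latency optimization.
request = InvokeModelInput(
    model_id="us.anthropic.claude-3-5-haiku-20241022-v1:0",  # placeholder
    content_type="application/json",
    body=body,
    performance_config_latency="optimized",
)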

trace class-attribute instance-attribute
trace: str | None = None

Specifies whether to enable or disable the Bedrock trace. If enabled, you can see the full Bedrock trace.

Output

InvokeModelOutput dataclass

Dataclass for InvokeModelOutput structure.

Source code in src/aws_sdk_bedrock_runtime/models.py
@dataclass(kw_only=True)
class InvokeModelOutput:
    """Dataclass for InvokeModelOutput structure."""

    body: bytes = field(repr=False)
    """Inference response from the model in the format specified in the
    `contentType` header. To see the format and content of the request and
    response bodies for different models, refer to [Inference
    parameters](https://docs.aws.amazon.com/bedrock/latest/userguide/model-parameters.html).
    """

    content_type: str
    """The MIME type of the inference result."""

    performance_config_latency: str | None = None
    """Model performance settings for the request."""

    def serialize(self, serializer: ShapeSerializer):
        serializer.write_struct(_SCHEMA_INVOKE_MODEL_OUTPUT, self)

    def serialize_members(self, serializer: ShapeSerializer):
        serializer.write_blob(_SCHEMA_INVOKE_MODEL_OUTPUT.members["body"], self.body)
        serializer.write_string(
            _SCHEMA_INVOKE_MODEL_OUTPUT.members["contentType"], self.content_type
        )
        if self.performance_config_latency is not None:
            serializer.write_string(
                _SCHEMA_INVOKE_MODEL_OUTPUT.members["performanceConfigLatency"],
                self.performance_config_latency,
            )

    @classmethod
    def deserialize(cls, deserializer: ShapeDeserializer) -> Self:
        return cls(**cls.deserialize_kwargs(deserializer))

    @classmethod
    def deserialize_kwargs(cls, deserializer: ShapeDeserializer) -> dict[str, Any]:
        kwargs: dict[str, Any] = {}

        def _consumer(schema: Schema, de: ShapeDeserializer) -> None:
            match schema.expect_member_index():
                case 0:
                    kwargs["body"] = de.read_blob(
                        _SCHEMA_INVOKE_MODEL_OUTPUT.members["body"]
                    )

                case 1:
                    kwargs["content_type"] = de.read_string(
                        _SCHEMA_INVOKE_MODEL_OUTPUT.members["contentType"]
                    )

                case 2:
                    kwargs["performance_config_latency"] = de.read_string(
                        _SCHEMA_INVOKE_MODEL_OUTPUT.members["performanceConfigLatency"]
                    )

                case _:
                    logger.debug("Unexpected member schema: %s", schema)

        deserializer.read_struct(_SCHEMA_INVOKE_MODEL_OUTPUT, consumer=_consumer)
        return kwargs

Attributes

body class-attribute instance-attribute
body: bytes = field(repr=False)

Inference response from the model in the format specified in the contentType header. To see the format and content of the request and response bodies for different models, refer to Inference parameters.
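
Consuming this field is typically a JSON decode gated on the returned content type; a short sketch, reusing the client and request from the examples above:

import json

output = await client.invoke_model(request)
if output.content_type == "application/json":
    result = json.loads(output.body)  # body is raw bytes; decode per content_type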

content_type instance-attribute
content_type: str

The MIME type of the inference result.

performance_config_latency class-attribute instance-attribute
performance_config_latency: str | None = None

Model performance settings for the request.

Errors