[Workshop] I attended a generative AI workshop on improving the user experience #LFS402 #AWSreInvent

AWS re:Invent 2023

#AWS

#Amazon Kendra

#Amazon Lex

hayashi.masaya

2023.11.29


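How is everyone enjoying re:Invent 2023?
This is Hayashi from the Osaka office.

I have been riding the generative AI trend as a "user", but I had never implemented the infrastructure side of it, so I took this opportunity to learn through a workshop!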

Session title

LFS402 | Scientific knowledge search & expedited regulatory submissions with AI

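Session overview

In this workshop, learn how to use AI to accelerate the clinical trial process, streamline regulatory submissions, and improve the productivity of staff in the life sciences.

See how to build a scalable enterprise search solution that uses AI to provide context-aware responses.

In addition, learn how to use Amazon Kendra and pgvector embeddings to extract key information from past in-house clinical studies and public sources (such as ClinicalTrials.gov).

Then learn how to use Amazon Lex to create a seamless, responsive user experience.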

Atmosphere before the workshop

The workshop was held in a room that seats nearly 100 people.

The main speaker was Nora O Sullivan.

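Pre-workshop lecture

Although it is not directly related to the hands-on work, the lecture began with an explanation of the clinical trial process.
The speaker and the other staff all work in health and life sciences, which is why clinical trials were chosen as the use case for this workshop.

What struck me most was how strongly the speaker made this point: some clinical trials finish in six months, but others take a very long time, around 7 to 10 years, so improving efficiency in each individual process is essential.

Nora O Sullivan went on to describe the problem. Despite this pressure for efficiency, users cannot search in natural language and have to use traditional tools such as SQL queries. Even when they finally find data that looks suitable, they still have to investigate it and manually check whether it fits their purpose, which costs a great deal of time and overhead.

The lecture closed with the statement that they have started using the power of "Gen AI" to find data suited to clinical trials efficiently, and then moved on to the workshop itself.

Below is the architecture diagram for the workshop.
We did not build everything from scratch; for some parts, resources and settings were already deployed as workshop defaults.

The workshop was split into four labs based on the architecture above.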

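Lab 1

Build an Amazon Kendra index from the clinical trial documents already placed in the workshop account.
The index makes the documents searchable and enables the scientific knowledge search later in the workshop.

Lab 2

Deploy an Amazon SageMaker model endpoint using the Flan-T5 model (Flan-T5 XXL BNB INT8) from Amazon SageMaker JumpStart.
In addition to creating the implementation the workshop needs for scientific knowledge search, see how the prompt uses the results from Amazon Kendra to produce better-summarized responses.

Lab 3

Deploy the AWS Lambda function code and update it to reflect the Amazon Kendra index name and the Amazon SageMaker endpoint.
AWS Lambda is used to integrate Amazon Lex, Amazon Kendra, and the Amazon SageMaker endpoint.

Lab 4

Build an interactive bot using Amazon Lex.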
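
As a rough illustration of how the four labs fit together, the flow can be sketched as follows. This is a minimal sketch with stand-in functions, not the workshop code: `answer_question`, `fake_retrieve`, and `fake_generate` are hypothetical names, and in the workshop the retrieval step is Amazon Kendra and the generation step is the SageMaker endpoint.

```python
from typing import Callable, List


def answer_question(
    question: str,
    retrieve: Callable[[str], List[str]],
    generate: Callable[[str], str],
    top_k: int = 2,
) -> str:
    """Retrieve passages for the question, then ask the LLM to answer from them."""
    passages = retrieve(question)[:top_k]
    if not passages:
        return "No matching documents were found."
    # Number the passages so the prompt can refer to them
    context = "\n".join(f"{i}. {p}" for i, p in enumerate(passages, start=1))
    prompt = (
        f"Given the question:\n{question}\n"
        f"and the following text:\n{context}\n"
        "Answer the question based on the text."
    )
    return generate(prompt)


# Hypothetical stand-in for the Kendra retrieval step (Lab 1)
def fake_retrieve(question: str) -> List[str]:
    return ["Passage A about the study.", "Passage B about the study.", "Passage C."]


# Hypothetical stand-in for the SageMaker endpoint (Lab 2)
def fake_generate(prompt: str) -> str:
    return f"(model answer based on {prompt.count('Passage')} passages)"


if __name__ == "__main__":
    print(answer_question("What is the demographic for ovarian cancer study?",
                          fake_retrieve, fake_generate))
```

Passing the retrieval and generation steps in as functions keeps the flow testable without an AWS account; the Lambda function in Lab 3 plays the role of `answer_question`, and Amazon Lex in Lab 4 supplies the question.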

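On to the workshop

Lab 1: Enable search of scientific and clinical documents with Amazon Kendra

This lab covers the part of the overall architecture marked with the orange frame below.

Main tasks

Create an Amazon Kendra index.

Add a data source to the Kendra index.

Test search using the Kendra index.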
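
The search test in the last task can also be run from code. The following is a minimal sketch, assuming the response shapes of the Kendra `ListIndices` and `Query` APIs; `find_index_id`, `summarize_results`, and `search` are hypothetical helpers that were not part of the workshop material.

```python
from typing import Any, Dict, List, Optional


def find_index_id(list_indices_response: Dict[str, Any], index_name: str) -> Optional[str]:
    """Return the ID of the index with the given name, or None if it is absent."""
    for item in list_indices_response.get("IndexConfigurationSummaryItems", []):
        if item.get("Name") == index_name:
            return item["Id"]
    return None


def summarize_results(query_response: Dict[str, Any], limit: int = 3) -> List[str]:
    """Turn the first few Query result items into 'title: excerpt' lines."""
    lines = []
    for item in query_response.get("ResultItems", [])[:limit]:
        title = item.get("DocumentTitle", {}).get("Text", "(untitled)")
        excerpt = item.get("DocumentExcerpt", {}).get("Text", "").strip()
        lines.append(f"{title}: {excerpt}")
    return lines


def search(index_name: str, question: str, region: str = "us-west-2") -> List[str]:
    """Look up the index by name and run a query against it (requires AWS access)."""
    import boto3  # imported lazily so the helpers above work without AWS access

    kendra = boto3.client("kendra", region_name=region)
    index_id = find_index_id(kendra.list_indices(), index_name)
    if index_id is None:
        raise ValueError(f"Kendra index not found: {index_name}")
    return summarize_results(kendra.query(IndexId=index_id, QueryText=question))
```

Calling `search("lfs402-setup-Index", "What is the demographic for ovarian cancer study?")` would need credentials for the workshop account; the two helper functions can be checked against hand-written response dictionaries.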

Lab 2: Deploy an LLM with SageMaker JumpStart

This lab covers the part of the overall architecture marked with the orange frame below.

Main tasks

Set up a SageMaker notebook.

Test interaction with the model from the notebook.

import boto3

index_name = "lfs402-setup-Index"

user_input = "What is the demographic for ovarian cancer study?"

kendra = boto3.client('kendra', region_name='us-west-2')

index_id = [i for i in kendra.list_indices()['IndexConfigurationSummaryItems'] 
                if i['Name'] == index_name][0]['Id']

query_out = kendra.retrieve(IndexId=index_id, QueryText=user_input)['ResultItems']
result_1 = str(query_out[0]['Content'])
result_2 = str(query_out[2]['Content'])


prompt = """Given the question: 

    """ + user_input + """

and the following text: 

    1.""" + result_1 + """
    2.""" + result_2 + """

Answer the question based on the text as if you are a pharmaceutical researcher"""

print(prompt)
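
The notebook code above indexes the retrieved passages directly, which raises an `IndexError` when Kendra returns fewer results than expected. A minimal sketch of a more defensive prompt builder follows; `build_prompt` and its `max_passages` parameter are hypothetical and not part of the workshop notebook. It assumes each result item carries its passage under the `Content` key, as in the code above.

```python
from typing import Any, Dict, List


def build_prompt(question: str, result_items: List[Dict[str, Any]], max_passages: int = 2) -> str:
    """Build the summarization prompt from the first few retrieved passages."""
    passages = [
        str(item["Content"])
        for item in result_items[:max_passages]
        if item.get("Content")
    ]
    if not passages:
        raise ValueError("No passages were retrieved for the question")
    # Number the passages the same way as the notebook prompt (1., 2., ...)
    numbered = "\n".join(f"    {i}. {text}" for i, text in enumerate(passages, start=1))
    return (
        "Given the question:\n\n"
        f"    {question}\n\n"
        "and the following text:\n\n"
        f"{numbered}\n\n"
        "Answer the question based on the text as if you are a pharmaceutical researcher"
    )
```

With the variables from the notebook, `build_prompt(user_input, query_out)` would produce a prompt with the same wording, using however many passages are available up to `max_passages`.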

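Lab 3: Deploy the AWS Lambda function

This lab covers the part of the overall architecture marked with the orange frame below.

Main tasks

Prepare a Lambda function that is invoked from Lex and sends requests to Kendra and the SageMaker endpoint.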

"""
This workshop sample function demonstrates an implementation of the Lex Code Hook Interface
in order to serve a sample bot which manages questions about Clinical Trials
Bot, Intent, and Slot models which are compatible with this sample can be found in the Lex Console.
For instructions on how to set up and test this bot, as well as additional samples,
visit the Lex Getting Started documentation http://docs.aws.amazon.com/lex/latest/dg/getting-started.html.

"""
"""
EDIT variables here
"""
#SageMaker endpoint
endpoint= "knowledge-summarization"
#Kendra Index ID
index_id= 'xxxxxx-1111-2222-3333-444455556666'
#Lex intent name
intent1 = "cancerstudy_search_intent"
"""
DON'T EDIT BELOW
"""
import json
import logging
import os
import time

import boto3
from botocore.client import Config

# --- Load logger ---
logger = logging.getLogger()
logger.setLevel(logging.DEBUG)

# --- SageMaker runtime client, used to invoke the LLM endpoint ---
sm = boto3.client('sagemaker-runtime', region_name='us-west-2')

# --- S3 client and bucket (not used by the handlers below) ---
s3_client = boto3.client('s3', config=Config(signature_version='s3v4'))
bucket = "aws-hcls-ml"

# --- Kendra client and the utterance prefixes mapped to index IDs ---
kendra = boto3.client('kendra', region_name='us-west-2')

indices = {
    "notes": {"index_id": index_id, "utterance": "doctors notes for "},
    "Clinical-trial-transparency-documents": {"index_id": index_id, "utterance": "cancer study "},
}
utterance_to_index_id = {v['utterance']: v['index_id'] for v in indices.values()}
# --- Main handler ---
def lambda_handler(event, context):
    """
    Route the incoming request based on intent.
    The JSON body of the request is provided in the event slot.
    """
    # By default, treat the user request as coming from the America/New_York time zone.
    os.environ['TZ'] = 'America/New_York'
    time.tzset()
    logger.debug('event.bot.name={}'.format(event['bot']['name']))
    logger.info(json.dumps(event))
    return dispatch(event)
# --- Intents ---
def dispatch(intent_request):
    """
    Called when the user specifies an intent for this bot.
    """
    logger.debug('dispatch userId={}, intentName={}'.format(intent_request['userId'], intent_request['currentIntent']['name']))
    intent_name = intent_request['currentIntent']['name']
    print("intent_name", intent_name)
    # Only the intent configured at the top of the file is supported
    if intent_name == intent1:
        return talk_to_workshopbot(intent_request)
    raise Exception('Intent with name ' + intent_name + ' not supported')
# --- Functions that control the bot's behavior ---

def talk_to_workshopbot(intent_request):

    # extract user input (a missing transcript is treated as an empty query)
    query = intent_request.get('inputTranscript') or ''
    session_attributes = intent_request['sessionAttributes'] if intent_request['sessionAttributes'] is not None else {}
    session_attributes['results'] = json.dumps({'HealthQuery': query})
    # parse query as either structured, unstructured
    if not query.strip():
        # Nothing to search for; avoid calling Kendra with an empty query
        reply = "Please enter a question."
    elif query.lower() == "how many patients have the condition asthma?":
        reply = "hello Chris H."
    else:
        user_input = query.lower()
        query_out = kendra.retrieve(IndexId=index_id, QueryText=user_input)['ResultItems']
        result_1 = str(query_out[0]['Content'])
        result_2 = str(query_out[1]['Content'])
        prompt = """Given the question: 
            """ + user_input + """
        and the following text: 
            1.""" + result_1 + """
            2.""" + result_2 + """
        , answer the question based on the text as if you were a pharmaceutical expert writing a new upcoming clinical study protocol:"""
        # Note: adding "min_length": 200 causes the response to take more than 30 seconds
        payload = {"text_inputs": prompt, "max_new_tokens": 500, "early_stopping": True,
                   "length_penalty": 2.0, "temperature": 0.7}
        print(f"payload: {payload}")
        response, status = call_llm(payload)
        print("response", response)
        print("status", status)
        generated_text = response["generated_texts"][0]
        print(
            f"generated text: {generated_text}"
            )
        reply = generated_text if status == 200 else response
        print(f"Reply:{reply}")
    session_attributes['reply'] = reply
    return close(session_attributes, 'Fulfilled', {'contentType': 'PlainText','content': reply})
def call_llm(event):
    # Send the JSON payload to the SageMaker endpoint
    response = sm.invoke_endpoint(EndpointName=endpoint,
                                  ContentType='application/json',
                                  Body=json.dumps(event).encode())
    # Parse the JSON body returned by the model and pass the HTTP status along
    result = json.loads(response['Body'].read().decode())
    return result, response['ResponseMetadata']['HTTPStatusCode']
# --- Helpers that build all the responses ---
def close(session_attributes, fulfillment_state, message):
    return {
        'sessionAttributes': session_attributes,
        'dialogAction': {
            'type': 'Close',
            'fulfillmentState': fulfillment_state,
            'message': message
        }
    }
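
The handler above reads fields such as `currentIntent`, `inputTranscript`, and `sessionAttributes` from the event and returns a `Close` dialog action. For checking that shape locally without Lex, a minimal sketch follows. The values in `sample_event` are made up for illustration (only the intent name comes from the code above), and `make_close_response` is a hypothetical helper that mirrors what `talk_to_workshopbot` and `close` build together.

```python
import json
from typing import Any, Dict

# Hand-written event with the fields the handler reads; values are illustrative only
sample_event: Dict[str, Any] = {
    "bot": {"name": "WorkshopBot"},  # hypothetical bot name
    "userId": "test-user",
    "inputTranscript": "What is the demographic for ovarian cancer study?",
    "currentIntent": {"name": "cancerstudy_search_intent", "slots": {}},
    "sessionAttributes": None,
}


def make_close_response(event: Dict[str, Any], reply: str) -> Dict[str, Any]:
    """Build the response structure the handler returns to Lex for a given reply."""
    # Lex may send sessionAttributes as None, so fall back to an empty dict
    session_attributes = dict(event.get("sessionAttributes") or {})
    session_attributes["results"] = json.dumps({"HealthQuery": event.get("inputTranscript")})
    session_attributes["reply"] = reply
    return {
        "sessionAttributes": session_attributes,
        "dialogAction": {
            "type": "Close",
            "fulfillmentState": "Fulfilled",
            "message": {"contentType": "PlainText", "content": reply},
        },
    }


if __name__ == "__main__":
    print(json.dumps(make_close_response(sample_event, "example reply"), indent=2))
```

A dictionary like `sample_event` could also be pasted into a Lambda test event to exercise `lambda_handler`, although that path calls Kendra and the SageMaker endpoint and so needs the workshop resources to exist.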