6. Call Azure AI Search from Python
This application explains how to call Azure AI Search from Python.
Prerequisites
- Python 3.10 or later
- Azure AI Search
- Azure OpenAI Service
Overview
Azure AI Search (formerly known as Azure Cognitive Search) is a fully managed cloud search service that provides information retrieval over user-owned content. Data plane REST APIs are used for indexing and query workflows. The Azure AI Search REST API reference provides detailed information about these APIs.
REST API specs in OpenAPI format are available in the Azure/azure-rest-api-specs repository.
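As a rough illustration of the data plane REST API, the sketch below builds (but does not send) a Search Documents request using only the standard library. The service name `example`, the index name, and the API version constant are assumptions for illustration; check the REST API reference for the version your service supports.

```python
import json
import urllib.request

# Assumption: a stable data plane API version; adjust to what your service supports.
API_VERSION = "2023-11-01"


def build_search_request(endpoint: str, index_name: str, api_key: str,
                         query: str, top: int = 3) -> urllib.request.Request:
    """Build (but do not send) a POST request for the Search Documents operation."""
    url = (f"{endpoint.rstrip('/')}/indexes/{index_name}"
           f"/docs/search?api-version={API_VERSION}")
    # "search" holds the query text, "top" limits the number of hits returned.
    body = json.dumps({"search": query, "top": top}).encode("utf-8")
    headers = {"Content-Type": "application/json", "api-key": api_key}
    return urllib.request.Request(url, data=body, headers=headers, method="POST")


if __name__ == "__main__":
    # Hypothetical endpoint and key, for illustration only.
    request = build_search_request(
        "https://example.search.windows.net", "yourindexname", "your-api-key", "meeting")
    print(request.full_url)
```

Sending the request with `urllib.request.urlopen(request)` and decoding the JSON response would return the matching documents; the client libraries described next wrap these calls for you.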
Samples for the Azure AI Search client library for Python are available in the Azure SDK for Python repository.
The Azure AI Search client library for Python (version 11.5.1) provides low-level APIs for working with Azure AI Search. It is flexible and lets you work with the service directly.
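A minimal sketch of a full-text query with the client library (the `azure-search-documents` package) might look like the following. The function names `search_index` and `extract_field` are hypothetical helpers, not part of the library, and the fields in each hit depend on your index schema.

```python
from typing import Any, Iterable


def extract_field(results: Iterable[dict[str, Any]], field: str) -> list[Any]:
    """Collect one field from each search hit that contains it."""
    return [hit[field] for hit in results if field in hit]


def search_index(endpoint: str, index_name: str, api_key: str,
                 query: str, top: int = 3) -> list[dict[str, Any]]:
    """Run a full-text query with the azure-search-documents package."""
    # Imported lazily so extract_field works without the package installed.
    from azure.core.credentials import AzureKeyCredential
    from azure.search.documents import SearchClient

    client = SearchClient(endpoint=endpoint, index_name=index_name,
                          credential=AzureKeyCredential(api_key))
    # Each hit behaves like a dict of index fields plus search metadata.
    return [dict(hit) for hit in client.search(search_text=query, top=top)]
```

For example, `extract_field(search_index(endpoint, "yourindexname", key, "meeting"), "content")` would return the text of each hit, assuming the index has a field named `content`.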
Introducing LangChain
LangChain is a framework for developing applications powered by large language models (LLMs). It provides a set of tools and libraries to help you build and deploy LLM-powered applications in production.
The OpenAI Python SDK, by contrast, provides a direct interface to OpenAI's API, enabling developers to integrate OpenAI's language models into their applications.
The relationship between LangChain and the OpenAI Python SDK is complementary. LangChain leverages the OpenAI Python SDK to access and utilize OpenAI's models, providing a higher-level abstraction that simplifies the integration of these models into more complex workflows and applications.
Use LangChain to access Azure AI Search easily
The Azure AI Search integration in LangChain provides a simple way to access Azure AI Search from Python.
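One way this can look in practice is sketched below, assuming the `langchain-community` package is installed. The `settings` dictionary and its keys (`endpoint`, `api_key`, `index_name`) are hypothetical names chosen for this example, and `embeddings` is any LangChain embeddings object that exposes `embed_query`. Note that constructing the vector store may contact the service.

```python
# Hypothetical setting names used only in this sketch.
REQUIRED_KEYS = ("endpoint", "api_key", "index_name")


def build_vector_store(settings: dict[str, str], embeddings):
    """Create a LangChain AzureSearch vector store from a settings dict."""
    missing = [key for key in REQUIRED_KEYS if not settings.get(key)]
    if missing:
        raise ValueError(f"missing settings: {', '.join(missing)}")

    # Lazy import: requires the langchain-community package.
    from langchain_community.vectorstores.azuresearch import AzureSearch

    return AzureSearch(
        azure_search_endpoint=settings["endpoint"],
        azure_search_key=settings["api_key"],
        index_name=settings["index_name"],
        embedding_function=embeddings.embed_query,
    )
```

With an embeddings object such as `AzureOpenAIEmbeddings` from `langchain_openai`, a query could then be written as `build_vector_store(settings, embeddings).similarity_search("meeting", k=3)`.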
Use RecursiveCharacterTextSplitter to recursively split text by characters
Long text must be split into smaller chunks before it is put into a search index.
Text splitting is a common task in natural language processing (NLP) and information retrieval (IR) applications, but implementing it by hand is tedious and error-prone.
LangChain's RecursiveCharacterTextSplitter provides a simple way to recursively split text by characters. Details are available in the LangChain documentation.
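To make the idea concrete, here is a simplified, dependency-free sketch of recursive splitting. It is an illustration of the general technique, not LangChain's implementation: it tries coarse separators first, falls back to finer ones for pieces that are still too long, and omits features such as chunk overlap. The function name `recursive_split` is hypothetical.

```python
def recursive_split(text: str, chunk_size: int,
                    separators: tuple[str, ...] = ("\n\n", "\n", " ", "")) -> list[str]:
    """Split text into chunks of at most chunk_size characters."""
    if len(text) <= chunk_size:
        return [text] if text else []
    if not separators or separators[0] == "":
        # Last resort: cut at fixed character offsets.
        return [text[i:i + chunk_size] for i in range(0, len(text), chunk_size)]

    separator, rest = separators[0], separators[1:]
    chunks: list[str] = []
    current = ""
    for piece in text.split(separator):
        candidate = f"{current}{separator}{piece}" if current else piece
        if len(candidate) <= chunk_size:
            current = candidate  # the piece still fits into the current chunk
            continue
        if current:
            chunks.append(current)
            current = ""
        if len(piece) <= chunk_size:
            current = piece
        else:
            # The piece alone is too long: retry it with finer separators.
            chunks.extend(recursive_split(piece, chunk_size, rest))
    if current:
        chunks.append(current)
    return chunks


if __name__ == "__main__":
    print(recursive_split("aaa bbb ccc\n\nddd eee", chunk_size=7))
    # prints ['aaa bbb', 'ccc', 'ddd eee']
```

Paragraph breaks are preferred as split points, so related sentences tend to stay in the same chunk; single characters are only used as cut points when nothing coarser fits.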
Usage
- Get the API key for Azure AI Search
- Copy .env.template to .env in the same directory
- Set credentials in .env
- Run scripts in the apps/6_call_azure_ai_search directory
[!CAUTION] AZURE_AI_SEARCH_INDEX_NAME in .env should be unique and should not be changed once set. If you change the index name, you will need to recreate the index and re-upload the documents.
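As an illustration of how a script could read the index name from .env, here is a small standard-library sketch. The scripts in this directory may load settings differently (for example with a dotenv package); the helper names `parse_env` and `load_index_name` are hypothetical, and only simple KEY=VALUE lines are handled.

```python
from pathlib import Path


def parse_env(text: str) -> dict[str, str]:
    """Parse simple KEY=VALUE lines, skipping blank lines and # comments."""
    values: dict[str, str] = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop optional surrounding quotes around the value.
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def load_index_name(env_path: str = ".env") -> str:
    """Return AZURE_AI_SEARCH_INDEX_NAME, failing loudly if it is unset."""
    settings = parse_env(Path(env_path).read_text(encoding="utf-8"))
    index_name = settings.get("AZURE_AI_SEARCH_INDEX_NAME", "")
    if not index_name:
        raise ValueError("AZURE_AI_SEARCH_INDEX_NAME is not set in .env")
    return index_name
```

Failing early when the variable is missing avoids accidentally creating or querying an index under an empty or default name.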
Set up the environment and install dependencies:
# Create a virtual environment
$ python -m venv .venv
# Activate the virtual environment
$ source .venv/bin/activate
# Install dependencies
$ pip install -r requirements.txt
Create an index in Azure AI Search and upload documents:
[!CAUTION] This script should be run only once to avoid creating duplicate indexes.
$ INDEX_NAME=yourindexname
$ FILE=./datasets/yourfile.csv
$ python apps/6_call_azure_ai_search/1_create_index.py \
    --index-name "$INDEX_NAME" \
    --file "$FILE" \
    --verbose
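The contents of 1_create_index.py are not shown here, so the following is only a hypothetical sketch of how a script with these flags could parse its arguments and turn CSV rows into documents. The flag names come from the command above; the function names and the CSV columns in the example are assumptions.

```python
import argparse
import csv
import io


def parse_args(argv: list[str]) -> argparse.Namespace:
    """Parse the same flags used in the command above."""
    parser = argparse.ArgumentParser(description="Create an index and upload documents")
    parser.add_argument("--index-name", required=True)
    parser.add_argument("--file", required=True, help="path to a CSV file")
    parser.add_argument("--verbose", action="store_true")
    return parser.parse_args(argv)


def rows_to_documents(csv_text: str) -> list[dict[str, str]]:
    """Turn CSV text with a header row into one dict per data row."""
    return list(csv.DictReader(io.StringIO(csv_text)))


if __name__ == "__main__":
    args = parse_args(["--index-name", "yourindexname", "--file", "yourfile.csv"])
    print(args.index_name, args.file, args.verbose)
```

In a real script, the resulting documents would then be split into chunks and uploaded to the index.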
Search documents in Azure AI Search:
$ INDEX_NAME=yourindexname
$ python apps/6_call_azure_ai_search/2_search_docs.py \
    --index-name "$INDEX_NAME" \
    --query "meeting" \
    --verbose
> All meetings must include a 5-minute meditation session.
> All meetings must begin with a joke.
> All meetings must have a theme, such as pirate or superhero.