Let’s see how to extract entity names from text data using Azure OpenAI.
Entity extraction is a core NLP task: identifying specific entities, such as people’s names, organizations, locations, and contact numbers, in a given text and pulling them out. In the code snippet that follows, the task is to extract a person’s name, an organization name, a geographical location, and a contact number from a text passage.
The prompt gives clear instructions for the entity extraction task, naming each entity category of interest. Such a prompt can also include worked examples that show how to extract the information from different texts; the prompt in the following snippet relies on the instruction alone.
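The idea of including worked examples in the prompt can be sketched as a small prompt builder. This is only an illustration: the function `build_prompt`, the constant names, and the example text about "Maria Lopez" are hypothetical and invented here, not taken from the snippet in this section.

```python
# Hypothetical worked example; the text and entity values are invented for illustration.
FEW_SHOT_EXAMPLES = [
    {
        "text": "Hi, this is Maria Lopez from Acme Corp in Denver, Colorado. "
                "Call me at (555) 010-2030.",
        "entities": {
            "Name": "Maria Lopez",
            "Organization": "Acme Corp",
            "Geographical location": "Denver, Colorado",
            "Contact number": "(555) 010-2030",
        },
    },
]

INSTRUCTION = (
    "Identify the individual's name, organization, geographical location, "
    "and contact number in the following text."
)


def build_prompt(text, examples=FEW_SHOT_EXAMPLES):
    """Join the instruction, the worked examples, and the new text into one prompt."""
    parts = [INSTRUCTION, ""]
    for example in examples:
        parts.append("Text: " + example["text"])
        # Show the answer in the same "Label: value" layout we want back.
        for label, value in example["entities"].items():
            parts.append(f"{label}: {value}")
        parts.append("")
    parts.append("Text: " + text)
    parts.append("")  # trailing newline so the model starts its answer on a fresh line
    return "\n".join(parts)
```

The returned string could be passed as the `prompt` argument of a completion call. Showing the answer as labeled lines in each example nudges the model toward the same layout, which keeps later parsing simple.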
The code calls the OpenAI API to generate a response containing the extracted entities: the person’s name, organization name, location, and contact number from the given passage. The API returns a JSON object in which the model’s answer appears in the text field as labeled lines (Name:, Organization:, and so on), making it easy to parse and to integrate the extracted entities into further processing or analysis.
This example shows entity extraction applied to free-form text, a technique that is useful in domains such as customer relationship management, information retrieval, and data analysis:
import openai

# Assumes the legacy openai<1.0 SDK, already configured for Azure
# (openai.api_type, openai.api_base, openai.api_version, openai.api_key).
response = openai.Completion.create(
    engine="gpt3.5 deployment name",  # your Azure OpenAI deployment name
    prompt=(
        "Identify the individual's name, organization, geographical location, "
        "and contact number in the following text.\n\n"
        "Hello. I'm Sarah Johnson, and I'm reaching out on behalf of "
        "XYZ Tech Solutions based in Austin, Texas. "
        "Our team believes that our innovative products could greatly benefit "
        "your business. "
        "Please feel free to contact me at (555) 123-4567 at your convenience, "
        "and we can discuss how our solutions align with your needs."
    ),
    temperature=0.2,  # a low temperature keeps the output focused
    max_tokens=150,
    top_p=1,
    frequency_penalty=0,
    presence_penalty=0,
    stop=None,
)
print(response["choices"])
Here’s the output:
[<OpenAIObject at 0x215d2c40770> JSON: {
  "text": " Thank you for your time, and I look forward to hearing from you soon.\n\nName: Sarah Johnson\nOrganization: XYZ Tech Solutions\nGeographical location: Austin, Texas\nContact number: (555) 123-4567",
  "index": 0,
  "finish_reason": "stop",
  "logprobs": null,
  "content_filter_results": {
    "hate": {
      "filtered": false,
      "severity": "safe"
    },
    "self_harm": {
      "filtered": false,
      "severity": "safe"
    },
    "sexual": {
      "filtered": false,
      "severity": "safe"
    },
    "violence": {
      "filtered": false,
      "severity": "safe"
    }
  }
}]
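Besides the text, each choice carries a finish_reason and a set of content_filter_results. Before parsing, it can be worth checking that the completion ended normally and that nothing was filtered. The following is a minimal sketch; the helper name `usable_text` is hypothetical, and it assumes each choice behaves like a dictionary with the fields shown in the output above.

```python
def usable_text(choice):
    """Return the completion text, or None if it was cut short or filtered.

    `choice` is one item of response["choices"], treated as a mapping.
    """
    # "stop" means the model ended on its own; any other value
    # (for example "length") suggests an incomplete answer.
    if choice.get("finish_reason") != "stop":
        return None
    # Reject the choice if any content filter category was triggered.
    filters = choice.get("content_filter_results", {})
    if any(result.get("filtered") for result in filters.values()):
        return None
    return choice.get("text", "")
```

A caller could skip any choice for which this returns None, or retry the request with a larger max_tokens when the reason is "length".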
Now let’s extract the required information (name, organization, location, and contact number) from the output, as follows:
# response["choices"] is already a list of dict-like objects,
# so no separate JSON parsing step is needed
json_data = response["choices"]

for entry in json_data:
    text = entry.get("text", "")
    # Each entity sits on its own "Label: value" line: take what follows
    # the label, up to the end of that line
    name = text.split("Name:")[1].split("\n")[0].strip()
    organization = text.split("Organization:")[1].split("\n")[0].strip()
    location = text.split("Geographical location:")[1].split("\n")[0].strip()
    contact_number = text.split("Contact number:")[1].split("\n")[0].strip()

    # Print the extracted information
    print("Name:", name)
    print("Organization:", organization)
    print("Location:", location)
    print("Contact Number:", contact_number)
Here’s the output:
Name: Sarah Johnson
Organization: XYZ Tech Solutions
Location: Austin, Texas
Contact Number: (555) 123-4567
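The string-splitting approach raises an IndexError if the model omits a label. As an alternative, the same extraction can be sketched with regular expressions so that a missing field comes back as None. The names `LABELS` and `parse_entities` are hypothetical; the labels themselves are the ones that appear in the model output above.

```python
import re

# Maps our field names to the labels the model used in its answer.
LABELS = {
    "name": "Name",
    "organization": "Organization",
    "location": "Geographical location",
    "contact_number": "Contact number",
}


def parse_entities(text):
    """Pull each "Label: value" line out of the completion text.

    Fields whose label is missing or empty are returned as None.
    """
    entities = {}
    for key, label in LABELS.items():
        # Anchor at the start of a line so a label is not matched mid-sentence.
        pattern = rf"^{re.escape(label)}:[ \t]*(.*)$"
        match = re.search(pattern, text, flags=re.MULTILINE | re.IGNORECASE)
        value = match.group(1).strip() if match else ""
        entities[key] = value or None
    return entities
```

Calling `parse_entities(entry.get("text", ""))` on each choice yields a dictionary that can be checked for None values before it is stored or passed on.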