Analyze text with Azure AI Language
- Language detection
- Key phrase extraction
- Sentiment analysis
- Named entity recognition
- Entity linking (currently resolves entities to Wikipedia entries)
Usage follows the usual pattern: create a resource for the task, then use its endpoint and key.
- Detect language
It is a REST API with the following constraints:
- Document size must be under 5,120 characters.
- The size limit is per document and each collection is restricted to 1,000 items (IDs).
- The request body is JSON holding a collection of documents; an optional countryHint can be added to improve prediction performance.
{ "kind": "LanguageDetection", "parameters": { "modelVersion": "latest" }, "analysisInput":{ "documents":[ { "id": "1", "text": "Hello world", "countryHint": "US" }, { "id": "2", "text": "Bonjour tout le monde" } ] } }
JSON response
- detectedLanguage
- confidenceScore (between 0 and 1; closer to 1 is better)
Ref: Detect language - Training | Microsoft Learn
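The constraints above can be checked on the client before a request is sent. The following is a minimal Python sketch that builds the LanguageDetection body shown earlier. The function name, the `country_hints` mapping, and the strict reading of "under 5,120 characters" are assumptions made for illustration; none of them come from the service SDK.

```python
import json

MAX_CHARS = 5120  # per-document limit from the notes above
MAX_DOCS = 1000   # per-collection limit from the notes above


def build_language_detection_request(texts, country_hints=None):
    """Build a LanguageDetection body from a list of strings.

    country_hints is an optional {list index: hint} mapping (hypothetical helper API).
    """
    country_hints = country_hints or {}
    if len(texts) > MAX_DOCS:
        raise ValueError(f"too many documents: {len(texts)} > {MAX_DOCS}")

    documents = []
    for index, text in enumerate(texts):
        # Conservative reading of "under 5,120 characters" (assumption).
        if len(text) >= MAX_CHARS:
            raise ValueError(f"document {index + 1} is too long: {len(text)} characters")
        document = {"id": str(index + 1), "text": text}
        if index in country_hints:
            document["countryHint"] = country_hints[index]
        documents.append(document)

    return {
        "kind": "LanguageDetection",
        "parameters": {"modelVersion": "latest"},
        "analysisInput": {"documents": documents},
    }


body = build_language_detection_request(
    ["Hello world", "Bonjour tout le monde"], country_hints={0: "US"}
)
print(json.dumps(body, indent=2))
```

Validating locally avoids spending a round trip on a request the service would reject anyway.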
- Extract key phrases
input
{ "kind": "KeyPhraseExtraction", "parameters": { "modelVersion": "latest" }, "analysisInput":{ "documents":[ { "id": "1", "language": "en", "text": "You must be the change you wish to see in the world." }, { "id": "2", "language": "en", "text": "The journey of a thousand miles begins with a single step." } ] } }
- Analyze sentiment
Measures how satisfied the writer is, e.g. for product reviews or prioritizing customer service responses.
{ "kind": "SentimentAnalysis", "parameters": { "modelVersion": "latest" }, "analysisInput": { "documents": [ { "id": "1", "language": "en", "text": "Good morning!" } ] } }
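Each sentence is scored with positive, negative, and neutral classification values between 0 and 1. When a document has several sentences, their results are combined into a document-level sentiment, except when it contains both positive and negative sentences, in which case the result is mixed.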
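The combination rule can be illustrated with a small Python sketch. The service returns the document-level sentiment itself, so this is only a model of the rule as noted here; the function name and the treatment of neutral sentences are assumptions.

```python
def document_sentiment(sentence_labels):
    """Combine per-sentence labels into one document label.

    Illustrative only: positive plus negative gives "mixed"; otherwise the
    non-neutral label wins, and all-neutral input stays neutral (assumption).
    """
    labels = set(sentence_labels)
    if "positive" in labels and "negative" in labels:
        return "mixed"
    if "positive" in labels:
        return "positive"
    if "negative" in labels:
        return "negative"
    return "neutral"


print(document_sentiment(["positive", "neutral"]))
print(document_sentiment(["positive", "negative", "neutral"]))
```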
- Named Entity Recognition identifier
Categories that can be recognized: Entity categories recognized by Named Entity Recognition in Azure AI Language - Azure AI services | Microsoft Learn
{ "kind": "EntityRecognition", "parameters": { "modelVersion": "latest" }, "analysisInput": { "documents": [ { "id": "1", "language": "en", "text": "Joe went to London on Saturday" } ] } }
- Extract linked entities (finds a reference for each entity)
Linked entities enable you to disambiguate common entities of the same name. For example, Paris >> refers to the French city
{ "kind": "EntityLinking", "parameters": { "modelVersion": "latest" }, "analysisInput": { "documents": [ { "id": "1", "language": "en", "text": "I saw Venus shining in the sky" } ] } }
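The key phrase, sentiment, entity recognition, and entity linking requests above share one shape and differ only in the "kind" value. A minimal Python sketch of a shared builder follows; the function name and the default language are assumptions for illustration.

```python
# Task kinds taken from the sample requests in these notes.
KINDS = {"KeyPhraseExtraction", "SentimentAnalysis", "EntityRecognition", "EntityLinking"}


def build_analyze_text_request(kind, texts, language="en"):
    """Build a request body for one of the document-based analysis kinds."""
    if kind not in KINDS:
        raise ValueError(f"unsupported kind: {kind}")
    documents = [
        {"id": str(number), "language": language, "text": text}
        for number, text in enumerate(texts, start=1)
    ]
    return {
        "kind": kind,
        "parameters": {"modelVersion": "latest"},
        "analysisInput": {"documents": documents},
    }


request = build_analyze_text_request("EntityLinking", ["I saw Venus shining in the sky"])
print(request["kind"], len(request["analysisInput"]["documents"]))
```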
Exercise - Analyze text
Knowledge check - Knowledge check - Training | Microsoft Learn
Create question answering solutions with Azure AI Language
- Understand question answering
The knowledge base can be created from existing sources, including:
- Web sites containing frequently asked question (FAQ) documentation.
- Files containing structured text, such as brochures or user guides.
- Built-in chit chat question and answer pairs that encapsulate common conversational exchanges
- Compare question answering to Azure AI Language understanding
Key | Question answering | Language understanding |
---|---|---|
Usage pattern | User submits a question, expecting an answer | User submits an utterance, expecting an appropriate response or action |
Query processing | Service uses natural language understanding to match the question to an answer in the knowledge base | Service uses natural language understanding to interpret the utterance, match it to an intent, and identify entities |
Response | Response is a static answer to a known question | Response indicates the most likely intent and referenced entities |
Client logic | Client application typically presents the answer to the user | Client application is responsible for performing appropriate action based on the detected intent |
- Create a knowledge base
ref: Create a knowledge base - Training | Microsoft Learn
Create an Azure AI Language resource and a Custom question answering project. The knowledge base can be populated from:
- URLs for web pages containing FAQs.
- Files containing structured text from which questions and answers can be derived.
- Predefined chit-chat datasets that include common conversational questions and responses in a specified style.
- Implement multi-turn conversation
The result can come straight from the knowledge base, or you can define a follow-up prompt to the question yourself.
- Test and publish a knowledge base
- Testing a knowledge base - Language Studio, submitting questions and reviewing the answers, judged by the confidence score
- Deploying a knowledge base - REST endpoint
- Use a knowledge base
- req
{
  "question": "What do I need to do to cancel a reservation?",
  "top": 2, // Maximum number of answers to be returned.
  "scoreThreshold": 20, // Score threshold for answers returned.
  "strictFilters": [ // Limit to only answers that contain the specified metadata.
    { "name": "category", "value": "api" }
  ]
}
- res
{ "answers": [ { "score": 27.74823341616769, "id": 20, "answer": "Call us on 555 123 4567 to cancel a reservation.", "questions": [ "How can I cancel a reservation?" ], "metadata": [ { "name": "category", "value": "api" } ] } ] }
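A client usually shows only the best answer, and only if its score is high enough. The Python sketch below works on the sample response above; the function name, the fallback text, and the client-side threshold are assumptions for illustration.

```python
import json

# The sample response from these notes, embedded so the sketch runs offline.
SAMPLE_RESPONSE = json.loads("""
{
  "answers": [
    {
      "score": 27.74823341616769,
      "id": 20,
      "answer": "Call us on 555 123 4567 to cancel a reservation.",
      "questions": ["How can I cancel a reservation?"],
      "metadata": [{"name": "category", "value": "api"}]
    }
  ]
}
""")


def best_answer(response, min_score=0.0, fallback="No answer found."):
    """Return the highest-scoring answer text at or above min_score."""
    candidates = [
        answer for answer in response.get("answers", [])
        if answer.get("score", 0) >= min_score
    ]
    if not candidates:
        return fallback
    return max(candidates, key=lambda answer: answer["score"])["answer"]


print(best_answer(SAMPLE_RESPONSE, min_score=20))
```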
- Improve question answering performance
Use active learning
- Create your question and answer pairs: provide some questions and answers for it to learn from, either by importing them or by adding them in Language Studio
- Review suggestions - as questions come in, Language Studio suggests alternate questions for you to accept or reject; if the suggestions look odd, you can add an alternate question yourself
- Define synonyms: use the REST API to submit synonyms
Exercise - Create a question answering solution
Knowledge check - Knowledge check - Training | Microsoft Learn
Q: Enable users to use your knowledge base through email?
A: create a bot for your published knowledge base and configure a channel for email communication.
Build a conversational language understanding model
A common design pattern for a natural language understanding solution
In this design pattern:
- An app accepts natural language input from a user.
- A language model is used to determine semantic meaning (the user's intent).
- The app performs an appropriate action.
- Understand prebuilt capabilities of the Azure AI Language service
Pre-configured features
- Summarization
- Named entity recognition
- Personally identifiable information (PII) detection
- Key phrase extraction
- Sentiment analysis
- Language detection
Learned features
- Conversational language understanding (CLU)
- Custom named entity recognition
- Custom text classification
- Question answering
- Understand resources for building a conversational language understanding model
Build your model
- Use Language Studio
- REST API
- Define intents, utterances, and entities
utterances - what the user says
intents - what the user wants
- Label precisely - Always label each entity with its right type. Only include what you want extracted; avoid unnecessary data in your labels.
- Label consistently - The same entity should have the same label across all the utterances.
- Label completely - Label all the instances of the entity in all your utterances.
entities - what the utterance refers to; they come in these types:
- Learned
- List - possible values
- Prebuilt - very common types
Ref: Define intents, utterances, and entities - Training | Microsoft Learn
- Use patterns to differentiate similar utterances
Group utterances that follow the same form under one intent:
- TurnOnDevice:
- "Turn on the {DeviceName}"
- "Switch on the {DeviceName}"
- "Turn the {DeviceName} on"
- GetDeviceStatus:
- "Is the {DeviceName} on[?]"
- TurnOffDevice:
- "Turn the {DeviceName} off"
- "Switch off the {DeviceName}"
- "Turn off the {DeviceName}"
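To see how such patterns separate similar utterances, here is a toy Python matcher that turns each pattern into a regular expression. This is not how the Language service works internally; it only illustrates the idea, and the handling of `{Name}` as a captured entity and `[text]` as optional text is an assumption.

```python
import re

PATTERNS = {
    "TurnOnDevice": [
        "Turn on the {DeviceName}",
        "Switch on the {DeviceName}",
        "Turn the {DeviceName} on",
    ],
    "GetDeviceStatus": ["Is the {DeviceName} on[?]"],
    "TurnOffDevice": [
        "Turn the {DeviceName} off",
        "Switch off the {DeviceName}",
        "Turn off the {DeviceName}",
    ],
}


def pattern_to_regex(pattern):
    """Translate one pattern string into a compiled, case-insensitive regex."""
    parts = []
    # Tokens are {Entity} placeholders, [optional] text, or literal text.
    for token in re.findall(r"\{\w+\}|\[[^\]]*\]|[^{\[]+", pattern):
        if token.startswith("{"):
            parts.append(f"(?P<{token[1:-1]}>.+?)")
        elif token.startswith("["):
            parts.append(f"(?:{re.escape(token[1:-1])})?")
        else:
            parts.append(re.escape(token))
    return re.compile("^" + "".join(parts) + "$", re.IGNORECASE)


COMPILED = {
    intent: [pattern_to_regex(pattern) for pattern in patterns]
    for intent, patterns in PATTERNS.items()
}


def classify(utterance):
    """Return (intent, entities) for the first matching pattern, else (None, {})."""
    text = utterance.strip()
    for intent, regexes in COMPILED.items():
        for regex in regexes:
            match = regex.match(text)
            if match:
                return intent, match.groupdict()
    return None, {}


print(classify("Turn the light on"))
print(classify("Is the heater on?"))
```

The three intents use nearly the same words, so it is the word order in each pattern that tells them apart.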
- Train, test, publish, and review a conversational language understanding model
- Train a model to learn intents and entities from sample utterances.
- Test the model interactively or using a testing dataset with known labels
- Deploy a trained model to a public endpoint so client apps can use it
- Review predictions and iterate on utterances to train your model
Exercise - Build an Azure AI services conversational language understanding model
Knowledge check - Build a conversational language understanding model - Knowledge check
Create a custom text classification solution
Understand the types of classification projects
- Single label classification - you can assign only one class to each file. For example, a video game summary could only be classified as "Adventure" or "Strategy".
- Multiple label classification - you can assign multiple classes to each file. This type of project would allow you to classify a video game summary as "Adventure" or "Adventure and Strategy".
Labeling data correctly, especially for multiple label projects, is directly correlated with how well your model performs. The higher the quality, clarity, and variation of your data set is, the more accurate your model will be.
- Evaluating and improving your model
Problems
- False positive - model predicts x, but the file isn't labeled x.
- False negative - model doesn't predict label x, but the file in fact is labeled x.
These measures help evaluate the model:
- Recall - Of all the actual labels, how many were identified; the ratio of true positives to all that was labeled.
- Precision - How many of the predicted labels are correct; the ratio of true positives to all identified positives.
- F1 Score - A function of recall and precision, intended to provide a single score to maximize for a balance of each component
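The three measures can be computed from counts of true positives, false positives, and false negatives. The Python sketch below uses the standard definitions, with F1 as the harmonic mean of precision and recall; the function name and the example counts are made up for illustration.

```python
def classification_metrics(true_positives, false_positives, false_negatives):
    """Return (precision, recall, f1) from raw counts."""
    predicted = true_positives + false_positives  # everything the model labeled x
    actual = true_positives + false_negatives     # everything really labeled x
    precision = true_positives / predicted if predicted else 0.0
    recall = true_positives / actual if actual else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


# Hypothetical counts: 8 correct predictions, 2 false positives, 4 false negatives.
precision, recall, f1 = classification_metrics(8, 2, 4)
print(f"precision={precision:.2f} recall={recall:.2f} f1={f1:.2f}")
```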
The model is used by calling the API.
- Understand how to build text classification projects
- Define labels:
- Tag data:
- Train model: the dataset is split into two parts, train / test; Azure AI Language can split it automatically, or you can split it manually
- View model: review the model score and other measures
- Improve model:
For example, you might find your model mixes up "Adventure" and "Strategy" games. Try to find more examples of each label to add to your dataset for retraining your model.
- Deploy model: the service supports versions / multiple models, so you can compare them and expose them through the REST API for clients to choose from
- Classify text:
Ref: Sample API - Understand how to build text classification projects - Training | Microsoft Learn
Exercise - Classify text
Knowledge check - https://learn.microsoft.com/en-us/training/modules/custom-text-classification/5-knowledge-check
Q: a classification task via the API. How do you get the results of the classification?
A: Get the value from the operation-location header in the request response, and use that to retrieve the results of the classification request.
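The answer above describes a submit-then-poll flow: read the operation-location URL from the response headers, then query it until the job finishes. A minimal Python sketch of the polling half follows. The HTTP call is injected as a function so the sketch runs offline; the status strings, function names, and the example URL are assumptions.

```python
import time

# Assumed status strings that mark the end of a job.
TERMINAL_STATES = {"succeeded", "failed", "cancelled"}


def wait_for_job(operation_location, fetch_job, interval_seconds=2.0, max_attempts=30):
    """Poll the operation-location URL until the job reaches a terminal state.

    fetch_job(url) must return the job as a dict with a "status" key; in a real
    client it would issue a GET with the resource key in the request headers.
    """
    for _ in range(max_attempts):
        job = fetch_job(operation_location)
        if job.get("status") in TERMINAL_STATES:
            return job
        time.sleep(interval_seconds)
    raise TimeoutError(f"job did not finish after {max_attempts} attempts")


def make_fake_fetcher(statuses):
    """Offline stand-in for the HTTP call: yields the given statuses in order."""
    remaining = iter(statuses)

    def fetch_job(url):
        return {"status": next(remaining)}

    return fetch_job


job = wait_for_job(
    "https://example.invalid/jobs/123",  # placeholder URL, not a real endpoint
    make_fake_fetcher(["notStarted", "running", "succeeded"]),
    interval_seconds=0,
)
print(job["status"])
```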
Custom named entity recognition
- To call a built-in NER
<YOUR-ENDPOINT>/language/analyze-text/jobs?api-version=<API-VERSION>
- Azure AI Language project life cycle
Very similar to the previous life cycle; the differences are:
1. Define entities: Understand the data and entities you want to identify, and try to make them as clear as possible. For example, defining exactly which parts of a bank statement you want to extract.
For the best performance, you'll need to use both high quality data to train the model and clearly defined entity types. Focus on the data:
- Diversity (varied)
- Distribution (well spread)
- Accuracy (Real Data)
2. Tag data: Label, or tag, your existing data, specifying what text in your dataset corresponds to which entity. This step is important to do accurately and completely.
- Consistency
- Precision (a result of consistency)
- Completeness
Label from Language Studio or the REST API. Ref: 3-tag-your-data - Training | Microsoft Learn
3. Train model
The confusion matrix allows you to visually identify where to add data to improve your model's performance. More Detail Train and evaluate your model - Training | Microsoft Learn
4. View model: After your model is trained, view the results of the model. This page includes a score of 0 to 1 that is based on the precision and recall of the data tested. You can see which entities worked well (such as customer name) and which entities need improvement (such as account number).
5. Improve model: Find out what data needs to be added to your model's training to improve performance.
- what data needs to be added to your model's training to improve performance
- how entities failed, and which entities (such as account number) need to be differentiated from other similar entities (such as loan amount).
Q: You've trained your model and you're seeing that it doesn't recognize your entities. What metric score is likely low to indicate that issue?
A: Recall indicates how well the model extracts entities, regardless of which entity that is.
6. Deploy model:
7. Extract entities: Use your model for extracting entities.
Exercise - Extract custom entities
Knowledge check - Knowledge check - Training | Microsoft Learn
Translate text with Azure AI Translator service
Azure AI Translator
- Language detection.
curl -X POST "https://api.cognitive.microsofttranslator.com/detect?api-version=3.0" -H "Ocp-Apim-Subscription-Region: <your-service-region>" -H "Ocp-Apim-Subscription-Key: <your-key>" -H "Content-Type: application/json" -d "[{ 'Text' : 'こんにちは' }]"
- One-to-many translation.
curl -X POST "https://api.cognitive.microsofttranslator.com/translate?api-version=3.0&from=ja&to=fr&to=en" -H "Ocp-Apim-Subscription-Key: <your-key>" -H "Ocp-Apim-Subscription-Region: <your-service-region>" -H "Content-Type: application/json; charset=UTF-8" -d "[{ 'Text' : 'こんにちは' }]"
- Script transliteration (converting text from its native script to an alternative script).
curl -X POST "https://api.cognitive.microsofttranslator.com/transliterate?api-version=3.0&fromScript=Jpan&toScript=Latn" -H "Ocp-Apim-Subscription-Key: <your-key>" -H "Ocp-Apim-Subscription-Region: <your-service-region>" -H "Content-Type: application/json" -d "[{ 'Text' : 'こんにちは' }]"
Sample API: Understand language detection, translation, and transliteration - Training | Microsoft Learn (remember to include the subscription key when calling it)
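The curl calls above can be assembled in code as well. The Python sketch below builds the URL, headers, and body for the one-to-many translation call without sending anything; the function name is hypothetical, and the key and region are placeholders.

```python
import json
from urllib.parse import urlencode

TRANSLATOR_ENDPOINT = "https://api.cognitive.microsofttranslator.com"


def build_translate_request(text, to_languages, key, region, from_language=None):
    """Return (url, headers, body) for a Translator v3 translate call."""
    query = [("api-version", "3.0")]
    if from_language:
        query.append(("from", from_language))
    # One "to" parameter per target language gives one-to-many translation.
    query.extend(("to", language) for language in to_languages)
    url = f"{TRANSLATOR_ENDPOINT}/translate?{urlencode(query)}"
    headers = {
        "Ocp-Apim-Subscription-Key": key,
        "Ocp-Apim-Subscription-Region": region,
        "Content-Type": "application/json; charset=UTF-8",
    }
    body = json.dumps([{"Text": text}], ensure_ascii=False).encode("utf-8")
    return url, headers, body


url, headers, body = build_translate_request(
    "こんにちは", ["fr", "en"], key="<your-key>", region="<your-service-region>",
    from_language="ja",
)
print(url)
# To send it: urllib.request.Request(url, data=body, headers=headers, method="POST")
```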
translation options
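- Word alignment - includeAlignment (alignment): shows which source characters each translated word came from, since some languages have no spaces between words the way English does
- includeSentenceLength (sentLen): returns the sentence lengths before and after translation
- Profanity - profanityAction: filters profanity (NoAction / Deleted / Marked, e.g. with asterisks)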
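Alignment information comes back as a string of index ranges, which is easier to read once parsed. The Python sketch below assumes the format `sourceStart:sourceEnd-targetStart:targetEnd`, with pairs separated by spaces and inclusive end indexes. The sample sentence and projection string are hypothetical, not real service output.

```python
def parse_alignment(proj):
    """Parse 'a:b-c:d e:f-g:h' into [((a, b), (c, d)), ((e, f), (g, h))]."""
    pairs = []
    for item in proj.split():
        source, target = item.split("-")
        source_start, source_end = (int(number) for number in source.split(":"))
        target_start, target_end = (int(number) for number in target.split(":"))
        pairs.append(((source_start, source_end), (target_start, target_end)))
    return pairs


def aligned_words(source_text, translated_text, proj):
    """Pair each source span with its translated span (end indexes inclusive)."""
    return [
        (source_text[s_start:s_end + 1], translated_text[t_start:t_end + 1])
        for (s_start, s_end), (t_start, t_end) in parse_alignment(proj)
    ]


# Hypothetical example values.
words = aligned_words("hello world", "bonjour monde", "0:4-0:6 6:10-8:12")
print(words)
```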
Define custom translations
When you need to translate business-specific terminology, you can add your own training data; Azure AI keeps each custom model under its own category, which you pass in the request:
curl -X POST "https://api.cognitive.microsofttranslator.com/translate?api-version=3.0&from=en&to=nl&category=<category-id>" -H "Ocp-Apim-Subscription-Key: <your-key>" -H "Content-Type: application/json; charset=UTF-8" -d "[{'Text':'Where can I find my employee details?'}]"
Exercise - Translate text with the Azure AI Translator service
Knowledge check - Knowledge check - Training | Microsoft Learn
Create speech-enabled apps with Azure AI services
Azure AI Speech resource feature
- Speech to text
- Text to speech
- Speech Translation
- Speaker Recognition: An API that enables your application to recognize individual speakers based on their voice (tells voices apart by person).
- Intent Recognition: An API that uses conversational language understanding to determine the semantic meaning of spoken input (like a chatbot that listens to speech).
Using the Azure AI Speech SDK
- Speech to text via SpeechRecognizer object
- Text to speech via SpeechSynthesizer object
- SDK results carry a Reason property that tells you the outcome
- SpeechConfig
- SetSpeechSynthesisOutputFormat sets the audio file type / sample rate / bit depth
- SpeechSynthesisVoiceName selects which of the available voices speaks
speechConfig.SetSpeechSynthesisOutputFormat(SpeechSynthesisOutputFormat.Riff24Khz16BitMonoPcm);
speechConfig.SpeechSynthesisVoiceName = "en-GB-George";
- Speech Synthesis Markup Language (SSML) syntax controls how the text is read (voice, style, tone), for example:
<speak version="1.0" xmlns="http://www.w3.org/2001/10/synthesis" xmlns:mstts="https://www.w3.org/2001/mstts" xml:lang="en-US"> <voice name="en-US-AriaNeural"> <mstts:express-as style="cheerful"> I say tomato </mstts:express-as> </voice> <voice name="en-US-GuyNeural"> I say <phoneme alphabet="sapi" ph="t ao m ae t ow"> tomato </phoneme>. <break strength="weak"/>Lets call the whole thing off! </voice> </speak>
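SSML like the sample above is often generated from user text, which then needs XML escaping. The Python sketch below builds a single-voice document with an optional speaking style and checks that it is well-formed XML. The function name is hypothetical, and the voice and style values are taken from the sample.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape, quoteattr


def build_ssml(text, voice="en-US-AriaNeural", style=None, lang="en-US"):
    """Return an SSML string for one voice, optionally wrapped in a style."""
    content = escape(text)  # protects &, <, > in user-supplied text
    if style:
        content = (
            f"<mstts:express-as style={quoteattr(style)}>{content}</mstts:express-as>"
        )
    return (
        '<speak version="1.0" xmlns="http://www.w3.org/2001/10/synthesis" '
        'xmlns:mstts="https://www.w3.org/2001/mstts" '
        f"xml:lang={quoteattr(lang)}>"
        f"<voice name={quoteattr(voice)}>{content}</voice>"
        "</speak>"
    )


ssml = build_ssml("Tom & Jerry say tomato", style="cheerful")
root = ET.fromstring(ssml)  # raises ParseError if the markup is not well-formed
print(ssml)
```

The resulting string is what a SpeechSynthesizer would be given when speaking SSML rather than plain text.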
Exercise - Create a speech-enabled app
Knowledge check - https://learn.microsoft.com/en-us/training/modules/create-speech-enabled-apps/8-knowledge-check
Translate speech with the Azure AI Speech service
- Translate speech to text
If the Reason property of the result has the enumerated value RecognizedSpeech, the Text property contains the transcription in the original language. You can also access a Translations property which contains a dictionary of the translations (using the two-character ISO language code, such as "en" for English, as a key).
Synthesize translations
- Event-based synthesis (configure TranslationConfig and create an event handler) - for 1:1 translation (translating from one source language into a single target language)
- Manual synthesis (call TranslationRecognizer directly, then synthesize the translations yourself)
Exercise - Translate speech
Knowledge check - Translate speech with the Azure AI Speech service - Knowledge check