Translating more than 90 languages: an introduction to text translation with the Azure Translator service

Category: Azure Cognitive Services  Tags: #Azure #AI #LanguageUnderstanding(LUIS) #Translator  Published: 2023-06-10 21:24:44

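Microsoft Azure offers a powerful translation service built on Microsoft's machine learning and deep learning research. It supports more than 90 languages, and Microsoft exposes it through three APIs, depending on the use case:

  • Text Translation API
  • Document Translation API
  • Custom Translator API

This article gives a quick, brief walkthrough of the basic features of the text translation service and how to use it.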

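Creating the Azure Translator service

Sign in to the Azure Portal and search for translator in the Marketplace. Select the Translator service from the results, click Create, fill in the resource details, and then start the deployment.

When the resource has been created, note down its key, endpoint, and region.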


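Creating the sample application

The sample code for this article can be downloaded from: https://github.com/hylinux/azure-demo/tree/main/dotnet/cognitive-service/Translator/TextTranslator

Creating the project

We use the dotnet CLI to build the sample. Create a console application with the following commands: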

dotnet new console -n TextTranslator
cd TextTranslator
dotnet add package Newtonsoft.Json

Open Program.cs and add the following at the top of the file:

using System;
using System.Net.Http;
using System.Text;
using System.Threading.Tasks;
using Newtonsoft.Json; // Install Newtonsoft.Json with NuGet

Then add the following three fields to the Program class:

    private static readonly string subscriptionKey = "YOUR-SUBSCRIPTION-KEY";
    // No trailing slash: the routes used below already start with "/".
    private static readonly string endpoint = "https://api.cognitive.microsofttranslator.com";

    // Add your location, also known as region. The default is global.
    // This is required if using a Cognitive Services resource.
    private static readonly string location = "YOUR_RESOURCE_LOCATION";

Replace these three values with the key, endpoint, and region of the resource you created.
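Hard-coding the key is fine for a quick demo, but it is easy to commit by accident. As a minimal sketch, the values could instead be read from environment variables. The class name TranslatorSettings and the variable names TRANSLATOR_KEY and TRANSLATOR_REGION are assumptions made for this illustration, not names defined by Azure or by the sample repository.

```csharp
using System;

public static class TranslatorSettings
{
    // Hypothetical environment variable names; pick whatever suits your setup.
    public const string KeyVariable = "TRANSLATOR_KEY";
    public const string RegionVariable = "TRANSLATOR_REGION";

    // Returns the value of the named environment variable,
    // or throws if it is missing or blank.
    public static string Require(string name)
    {
        var value = Environment.GetEnvironmentVariable(name);
        if (string.IsNullOrWhiteSpace(value))
        {
            throw new InvalidOperationException(
                "Environment variable " + name + " is not set.");
        }
        return value;
    }
}
```

With this in place, the subscriptionKey field could be initialized with TranslatorSettings.Require(TranslatorSettings.KeyVariable) instead of a string literal, so a missing key fails at startup with a clear message.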

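How it works

The idea is simple: the code creates a client for the REST API and uses it to send parameters to the API endpoint.

There are two kinds of parameters: those sent in the request headers, and those appended to the endpoint URL as query parameters. For the query parameters, see the API reference: https://docs.microsoft.com/zh-cn/azure/cognitive-services/translator/reference/v3-0-reference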
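To make the query parameters concrete, here is a small sketch that assembles the route for a translate call from a source language and one or more target languages. It uses only the api-version, from, and to parameters that appear in this article. The helper name BuildTranslateRoute is hypothetical.

```csharp
using System;
using System.Text;

public static class TranslatorRoutes
{
    // Builds a route such as "/translate?api-version=3.0&from=en&to=it".
    // Pass null or "" as 'from' to leave the source language unspecified.
    public static string BuildTranslateRoute(string from, params string[] to)
    {
        if (to == null || to.Length == 0)
        {
            throw new ArgumentException("At least one target language is required.", nameof(to));
        }

        var route = new StringBuilder("/translate?api-version=3.0");
        if (!string.IsNullOrEmpty(from))
        {
            route.Append("&from=").Append(Uri.EscapeDataString(from));
        }
        // The 'to' parameter is repeated once per target language.
        foreach (var language in to)
        {
            route.Append("&to=").Append(Uri.EscapeDataString(language));
        }
        return route.ToString();
    }
}
```

Calling TranslatorRoutes.BuildTranslateRoute("en", "zh-Hans", "it") yields the same route string that the TextTranslator method in this article hard-codes.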

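Request header reference:

Header                          Description
Ocp-Apim-Subscription-Key       Authentication key. Required.
Ocp-Apim-Subscription-Region    Region of the resource. Optional for a global Translator resource; required when using a multi-service Cognitive Services resource or a regional resource.
Content-Type                    Must be: application/json; charset=UTF-8
X-ClientTraceId                 Optional. A client-generated GUID that uniquely identifies the request. It can be omitted if the trace ID is already passed in the query string as the ClientTraceId query parameter.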
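The headers above can be sketched as a small helper that prepares a POST request, including the optional X-ClientTraceId header that the later samples in this article leave out. The class and method names (TranslatorRequests, Create) are hypothetical; the header names are the ones listed above.

```csharp
using System;
using System.Net.Http;
using System.Text;

public static class TranslatorRequests
{
    // Creates a POST request carrying the JSON body and the Translator headers.
    public static HttpRequestMessage Create(Uri uri, string jsonBody, string key, string region)
    {
        var request = new HttpRequestMessage(HttpMethod.Post, uri);

        // StringContent sets Content-Type to application/json with a UTF-8 charset.
        request.Content = new StringContent(jsonBody, Encoding.UTF8, "application/json");
        request.Headers.Add("Ocp-Apim-Subscription-Key", key);

        // Only send the region header when a region was supplied.
        if (!string.IsNullOrEmpty(region))
        {
            request.Headers.Add("Ocp-Apim-Subscription-Region", region);
        }

        // A fresh GUID per request makes it easier to correlate a call when troubleshooting.
        request.Headers.Add("X-ClientTraceId", Guid.NewGuid().ToString());
        return request;
    }
}
```

The returned message can be passed to HttpClient.SendAsync in the same way as the requests built inline in the samples below.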

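Translating text

One thing to watch here is the language code of the target language. For the supported codes, see: https://docs.microsoft.com/zh-cn/azure/cognitive-services/translator/language-support

In this example we translate English text into Simplified Chinese and Italian. First, define a TextTranslator method that performs the translation: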

private static async Task TextTranslator()
{
    // Translate into Simplified Chinese and Italian.
    string route = "/translate?api-version=3.0&from=en&to=zh-Hans&to=it";
    string textToTranslate = "I like this Server, it will be help us on translator function!";
    object[] body = new object[] { new { Text = textToTranslate } };
    var requestBody = JsonConvert.SerializeObject(body);

    using (var client = new HttpClient())
    using (var request = new HttpRequestMessage())
    {
        // Build the request.
        request.Method = HttpMethod.Post;
        request.RequestUri = new Uri(endpoint + route);
        request.Content = new StringContent(requestBody, Encoding.UTF8, "application/json");
        request.Headers.Add("Ocp-Apim-Subscription-Key", subscriptionKey);
        request.Headers.Add("Ocp-Apim-Subscription-Region", location);

        // Send the request and get response.
        HttpResponseMessage response = await client.SendAsync(request).ConfigureAwait(false);
        // Read response as a string.
        string result = await response.Content.ReadAsStringAsync();
        Console.WriteLine(result);
    }
}

To run the text translation, declare Main as static async Task Main so that it can use await, and call the method from it:

// Text translation demo
await TextTranslator();

The output is:

[{"translations":[{"text":"我喜欢这个服务器,它将帮助我们在翻译功能!","to":"zh-Hans"},{"text":"Mi piace questo server, ci aiuterà nella funzione traduttore!","to":"it"}]}]
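The sample prints the raw JSON. In an application you will usually want the individual translations, so here is a minimal sketch that pulls the target language and translated text out of a response shaped like the one above. It uses System.Text.Json from the .NET base library so that the snippet has no extra dependency; that is a substitution made for this illustration, and the Newtonsoft.Json package already referenced by the project could do the same job. The class name TranslationParser is hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Text.Json;

public static class TranslationParser
{
    // Returns (target language, translated text) pairs from a /translate response body.
    public static List<KeyValuePair<string, string>> Parse(string json)
    {
        var results = new List<KeyValuePair<string, string>>();
        using (JsonDocument document = JsonDocument.Parse(json))
        {
            // The response is an array with one element per input text.
            foreach (JsonElement item in document.RootElement.EnumerateArray())
            {
                // Each element carries one translation per requested target language.
                foreach (JsonElement translation in item.GetProperty("translations").EnumerateArray())
                {
                    results.Add(new KeyValuePair<string, string>(
                        translation.GetProperty("to").GetString(),
                        translation.GetProperty("text").GetString()));
                }
            }
        }
        return results;
    }
}
```

This sketch assumes a successful response. It is worth checking response.IsSuccessStatusCode before parsing, because a failed call does not return this array shape.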

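Language detection

Suppose you are given a sentence in a language you do not recognize. You then cannot specify the source language through the from= parameter, but you can let the API detect it automatically.

Letting the API detect the language while translating

Simply omit the from parameter from the request, and the API will detect the source language and translate the text. Define the following GuessSourceTextTranslator() method: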

private static async Task GuessSourceTextTranslator()
{
    // Translate into Simplified Chinese and Italian.
    // Note that the from parameter is omitted, so the API detects the source language itself.
    string route = "/translate?api-version=3.0&to=zh-Hans&to=it";
    string textToTranslate = "I like this Server, it will be help us on translator function!";
    object[] body = new object[] { new { Text = textToTranslate } };
    var requestBody = JsonConvert.SerializeObject(body);

    using (var client = new HttpClient())
    using (var request = new HttpRequestMessage())
    {
        // Build the request.
        request.Method = HttpMethod.Post;
        request.RequestUri = new Uri(endpoint + route);
        request.Content = new StringContent(requestBody, Encoding.UTF8, "application/json");
        request.Headers.Add("Ocp-Apim-Subscription-Key", subscriptionKey);
        request.Headers.Add("Ocp-Apim-Subscription-Region", location);

        // Send the request and get response.
        HttpResponseMessage response = await client.SendAsync(request).ConfigureAwait(false);
        // Read response as a string.
        string result = await response.Content.ReadAsStringAsync();
        Console.WriteLine(result);
    }
}

The output is:

[{"detectedLanguage":{"language":"en","score":1.0},"translations":[{"text":"我喜欢这个服务器,它将帮助我们在翻译功能!","to":"zh-Hans"},{"text":"Mi piace questo server, ci aiuterà nella funzione traduttore!","to":"it"}]}]

Detecting the language without translating

The detect API identifies the language of a text without translating it:

private static async Task DetectLanguage()
{
    // Detect the language only; do not translate.
    // The detect route needs no from or to parameters.
    string route = "/detect?api-version=3.0";
    string textToTranslate = "I like this Server, it will be help us on translator function!";
    object[] body = new object[] { new { Text = textToTranslate } };
    var requestBody = JsonConvert.SerializeObject(body);

    using (var client = new HttpClient())
    using (var request = new HttpRequestMessage())
    {
        // Build the request.
        request.Method = HttpMethod.Post;
        request.RequestUri = new Uri(endpoint + route);
        request.Content = new StringContent(requestBody, Encoding.UTF8, "application/json");
        request.Headers.Add("Ocp-Apim-Subscription-Key", subscriptionKey);
        request.Headers.Add("Ocp-Apim-Subscription-Region", location);

        // Send the request and get response.
        HttpResponseMessage response = await client.SendAsync(request).ConfigureAwait(false);
        // Read response as a string.
        string result = await response.Content.ReadAsStringAsync();
        Console.WriteLine(result);
    }
}

The output is:

[{"language":"en","score":1.0,"isTranslationSupported":true,"isTransliterationSupported":false}]
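A typical use of this result is to decide whether a translate call is worthwhile. The following sketch reads the language, score, and isTranslationSupported fields from a response shaped like the one above. As an assumption made for this illustration, it uses System.Text.Json from the .NET base library rather than Newtonsoft.Json, and the type and method names (DetectedLanguage, DetectionParser.ParseFirst) are hypothetical.

```csharp
using System;
using System.Text.Json;

// Holds the fields of interest from one detection result.
public sealed class DetectedLanguage
{
    public string Language { get; set; }
    public double Score { get; set; }
    public bool IsTranslationSupported { get; set; }
}

public static class DetectionParser
{
    // Parses the first element of a /detect response body.
    public static DetectedLanguage ParseFirst(string json)
    {
        using (JsonDocument document = JsonDocument.Parse(json))
        {
            // The response is an array with one element per input text.
            JsonElement first = document.RootElement[0];
            return new DetectedLanguage
            {
                Language = first.GetProperty("language").GetString(),
                Score = first.GetProperty("score").GetDouble(),
                IsTranslationSupported = first.GetProperty("isTranslationSupported").GetBoolean()
            };
        }
    }
}
```

A caller could, for example, skip the translation step when IsTranslationSupported is false or when Score falls below a threshold of its own choosing.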

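For more detailed examples, such as transliteration, sentence length (character count), and dictionary lookup, see Microsoft's official documentation: https://docs.microsoft.com/zh-cn/azure/cognitive-services/translator/quickstart-translator?tabs=csharp#get-sentence-length

That concludes this quick introduction to text translation. As you can see, the translation feature can be integrated into your application very quickly. If you have any other questions, feel free to contact me.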