Hey, I've recently been working on some NLP projects. After finishing the Chinese ones I wanted to try English, but anything my crawler scrapes from domestic sites is obviously in Chinese, and I've never tried crawling foreign sites, so I had no confidence there. So I decided to go with translation instead. Along the way ChatGPT gave me a whole bunch of methods, but every single one failed for one reason or another. PS: GPT is a big liar!

sample1:

NOTE: the youdao API package, useless.

sample2:

NOTE: the translate package, useless.

sample3:

NOTE: the Google translation package, useless.

Baidu's API, on the other hand, is the GOAT! Very nice! 👑

Step 1. Apply for access 🎈


Apply for the Baidu text translation API here (overall workflow page): https://console.bce.baidu.com/ai/?_=1652768945367&fromai=1#/ai/machinetranslation/overview/index

Step 2. Once the application is done, just use my code below; all you need is the API Key and the Secret Key. 🎈


# -*- coding: utf-8 -*-

# This code shows an example of text translation from Simplified Chinese to English.
# This code runs on Python 2.7.x and Python 3.x.
# You may install `requests` to run this code: pip install requests
# Please refer to `https://api.fanyi.baidu.com/doc/21` for complete api document

import requests
import json

def get_access_token():
    """
    Use the AK/SK pair to generate the auth token (Access Token).
    client_id: API Key
    client_secret: Secret Key
    :return: access_token, or None on failure
    """
    url = "https://aip.baidubce.com/oauth/2.0/token"
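    # NOTE: replace the hard-coded values below with your own API Key (client_id)
    # and Secret Key (client_secret) from the Baidu console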
    params = {"grant_type": "client_credentials", "client_id": '5UHGfQaGLKlINhXRv1lA0tl3', "client_secret": 'evGZuz1r14MRElOt638D8GMdheQ9gKZj'}
    return str(requests.post(url, params=params).json().get("access_token"))

def baidu_translate(q):
    token = get_access_token()
    url = 'https://aip.baidubce.com/rpc/2.0/mt/texttrans/v1?access_token=' + token
    
    # For the list of language codes, see `https://ai.baidu.com/ai-doc/MT/4kqryjku9#语种列表`
    from_lang = 'zh'  # source language (Chinese)
    to_lang = 'en'    # target language (English)
    term_ids = ''     # term bank IDs; separate multiple IDs with commas
    
    
    # Build request
    headers = {'Content-Type': 'application/json'}
    payload = {'q': q, 'from': from_lang, 'to': to_lang, 'termIds' : term_ids}
    
    # Send request
    r = requests.post(url, json=payload, headers=headers)
    result = r.json()
    
    # Show response
    # print(json.dumps(result, indent=4, ensure_ascii=False))
    return result['result']['trans_result'][0]['dst']
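
A minimal sanity check you could run yourself; the test sentence below is just a made-up example, not from the original post:

if __name__ == '__main__':
    # Quick test: translate one Chinese sentence and print the English result
    print(baidu_translate('今天天气真好,我们去公园散步吧。'))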

Result screenshot:


 

Yay! Enjoy. While crawling data, just call this function to convert the text to English before storing it as a CSV or whatever, and this little problem is perfectly solved. Worth writing down! --<-<-<@🌹
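
As a rough sketch of that "translate while crawling, then save to CSV" idea: the texts list, column names, and output filename below are my own hypothetical examples, and the whole thing only assumes the baidu_translate function defined above, nothing else from the original post.

import csv
import time

def save_translated(texts, out_path='translated.csv'):
    # texts: a list of crawled Chinese strings (hypothetical input)
    with open(out_path, 'w', newline='', encoding='utf-8-sig') as f:
        writer = csv.writer(f)
        writer.writerow(['zh', 'en'])  # header row: original text, translation
        for zh in texts:
            en = baidu_translate(zh)   # translate each crawled sentence
            writer.writerow([zh, en])
            time.sleep(0.2)            # crude rate limiting so we don't hammer the API

save_translated(['这是一条示例句子。', '再来一条。'])

Writing with encoding='utf-8-sig' keeps the Chinese column readable when the CSV is opened in Excel.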
