Code Samples - HTTP Tunnel

This document provides code samples for requesting the HTTP tunnel programmatically, for developers' reference.

How to use the code samples

  1. The samples cannot be run as-is: the tunnel server host mytunnelhost, port mytunnelport, tunnel id mytid, and password mypassword in the code are placeholders. Replace them with your own real values and the samples will run. View my tunnel info >>
  2. The runtime requirements and caveats for each sample are listed at the end of the sample; please read them carefully before use.
  3. If you run into problems while using the samples, please contact after-sales support and we will provide technical assistance.
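Every sample below embeds the tunnel id and password into the proxy URL in the form `http://tid:password@host:port/`. A minimal Python 3 sketch of that format, using the same placeholder values as above:

```python
# Build the proxy URL that the samples below pass to their HTTP clients.
# All four values are placeholders; substitute your own tunnel credentials.
def build_proxy_url(tid, password, host, port):
    return "http://%s:%s@%s:%s/" % (tid, password, host, port)

proxies = {
    "http": build_proxy_url("mytid", "mypassword", "mytunnelhost", "mytunnelport"),
    "https": build_proxy_url("mytid", "mypassword", "mytunnelhost", "mytunnelport"),
}
print(proxies["http"])  # http://mytid:mypassword@mytunnelhost:mytunnelport/
```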

Python2

requests

requests (recommended)

Usage tips

  1. The requests-based sample supports both http and https pages and is recommended.
  2. requests is not part of the Python standard library; install it first: pip install requests
# -*- coding: utf-8 -*-


""" HTTP隧道代理 request样例 使用requests请求代理服务器
    请求http和https网页均适用
"""

import requests

# Target pages to fetch
page_urls = ["http://dev.kdlapi.com/testproxy",
             "https://dev.kdlapi.com/testproxy",
             ]

# Tunnel server
tunnel_host = "tps136.kdlapi.com"
tunnel_port = "15818"

# Tunnel id and password
tid = ""
password = ""

proxies = {
    "http": "http://%s:%s@%s:%s/" % (tid, password, tunnel_host, tunnel_port),
    "https": "http://%s:%s@%s:%s/" % (tid, password, tunnel_host, tunnel_port)
}

headers = {
    "Accept-Encoding": "Gzip",  # gzip-compress the response for faster transfers
}

for url in page_urls:
    r = requests.get(url, proxies=proxies, headers=headers)

    print r.status_code  # response status code

    if r.status_code == 200:
        r.encoding = "utf-8"  # set the response encoding
        print r.content  # page content

urllib2

urllib2

Usage tips

  • The urllib2-based sample supports both http and https pages
  • Requires Python 2.6 / 2.7
# -*- coding: utf-8 -*-


"""隧道代理urllib样例使用urllib2请求代理服务器
    请求http和https网页均适用
"""

import urllib2
import zlib
import ssl

ssl._create_default_https_context = ssl._create_unverified_context  # globally disable certificate verification to avoid errors on https pages

# Target pages to fetch
page_urls = ["http://dev.kdlapi.com/testproxy",
             "https://dev.kdlapi.com/testproxy",
             ]

# Tunnel server
tunnel_host = "tps136.kdlapi.com"
tunnel_port = "15818"

# Tunnel id and password
tid = ""
password = ""

proxies = {
    "http": "http://%s:%s@%s:%s/" % (tid, password, tunnel_host, tunnel_port),
    "https": "http://%s:%s@%s:%s/" % (tid, password, tunnel_host, tunnel_port)
}

for url in page_urls:
    req = urllib2.Request(url)
    req.add_header("Accept-Encoding", "Gzip")  # gzip-compress the response for faster transfers
    proxy_handler = urllib2.ProxyHandler(proxies)
    opener = urllib2.build_opener(proxy_handler)
    urllib2.install_opener(opener)
    r = urllib2.urlopen(req)

    print r.code
    content_encoding = r.headers.getheader("Content-Encoding")
    if content_encoding and "gzip" in content_encoding:
        print zlib.decompress(r.read(), 16 + zlib.MAX_WBITS)  # page content
    else:
        print r.read()  # page content
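The `16 + zlib.MAX_WBITS` argument above tells zlib to expect a gzip header and trailer rather than a raw deflate stream. A self-contained Python 3 round trip showing the same decode (the payload is made up for illustration):

```python
import gzip
import zlib

original = b"Hello from the tunnel proxy"  # made-up payload
compressed = gzip.compress(original)       # gzip-framed bytes, as a server would send

# wbits = 16 + zlib.MAX_WBITS selects gzip (not raw deflate) decoding
decoded = zlib.decompress(compressed, 16 + zlib.MAX_WBITS)
assert decoded == original
```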

Python3

requests

requests (recommended)

Usage tips

  1. The requests-based sample supports both http and https pages and is recommended
  2. requests is not part of the Python standard library; install it first: pip install requests
# -*- coding: utf-8 -*-

"""
 @File            :  tps_proxy_request.py
 @description     :  Python3 HTTP隧道代理 request样例 使用requests请求代理服务器
                        请求http和https网页均适用
"""

import requests

# Target pages to fetch
page_urls = ["http://dev.kdlapi.com/testproxy",
             "https://dev.kdlapi.com/testproxy",
             ]

# Tunnel server
tunnel_host = "tps136.kdlapi.com"
tunnel_port = "15818"

# Tunnel id and password
tid = ""
password = ""

proxies = {
    "http": "http://%s:%s@%s:%s/" % (tid, password, tunnel_host, tunnel_port),
    "https": "http://%s:%s@%s:%s/" % (tid, password, tunnel_host, tunnel_port)
}
headers = {
    "Accept-Encoding": "Gzip",  # gzip-compress the response for faster transfers
}

for url in page_urls:
    r = requests.get(url, proxies=proxies, headers=headers)

    print(r.status_code)  # response status code

    if r.status_code == 200:
        r.encoding = "utf-8"  # set the response encoding
        print(r.content)  # page content

urllib

urllib

Usage tips

  • The urllib-based sample supports both http and https pages
  • Requires Python 3.x
import urllib.request
import zlib
import ssl

ssl._create_default_https_context = ssl._create_unverified_context  # globally disable certificate verification to avoid errors on https pages

"""使用urllib.request模块请求代理服务器,http和https网页均适用"""

# Target pages to fetch
page_urls = ["http://dev.kdlapi.com/testproxy",
             "https://dev.kdlapi.com/testproxy",
             ]

# Tunnel server
tunnel_host = "mytunnelhost"
tunnel_port = "mytunnelport"

# Tunnel id and password
tid = "mytid"
password = "mypassword"

proxies = {
    "http": "http://%s:%s@%s:%s/" % (tid, password, tunnel_host, tunnel_port),
    "https": "http://%s:%s@%s:%s/" % (tid, password, tunnel_host, tunnel_port)
}

headers = {
    "Accept-Encoding": "Gzip",  # gzip-compress the response for faster transfers
}

for url in page_urls:
    proxy_handler = urllib.request.ProxyHandler(proxies)
    opener = urllib.request.build_opener(proxy_handler)

    req = urllib.request.Request(url=url, headers=headers)

    result = opener.open(req)
    print(result.status)  # response status code

    content_encoding = result.headers.get('Content-Encoding')
    if content_encoding and "gzip" in content_encoding:
        print(zlib.decompress(result.read(), 16 + zlib.MAX_WBITS).decode('utf-8'))  # page content
    else:
        print(result.read().decode('utf-8'))  # page content

Python-Scrapy

The standard scrapy project layout looks like this:
(scrapy project structure)

Usage tips

  1. Works for both http and https pages
  2. scrapy is not part of the Python standard library; install it first: pip install scrapy
  3. Run the following command from the top-level scrapy_proxy directory to see the result: scrapy crawl testproxy

middlewares.py

Add the following code to middlewares.py to configure the proxy:

# -*- coding: utf-8 -*-

# Define here the models for your spider middleware
#
# See documentation in:
# https://docs.scrapy.org/en/latest/topics/spider-middleware.html

from scrapy import signals
import logging
import base64


logger = logging.getLogger(__name__)
# Tunnel id and password
tid = "mytid"
password = "mypassword"
# Primary tunnel host and port
tunnel_master_host = "tps136.kdlapi.com"
tunnel_master_port = 15818
# Backup tunnel host and port
tunnel_slave_host = "tps168.kdlapi.com"
tunnel_slave_port = 15818
# Switch-over threshold
threshold = 3


# Proxy middleware
class ProxyDownloadMiddleware(object):

    def process_request(self, request, spider):
        global threshold
        if threshold > 0:
            host, port = tunnel_master_host, tunnel_master_port
        else:
            host, port = tunnel_slave_host, tunnel_slave_port
        if request.url.startswith("http://"):
            proxy_url = 'http://{host}:{port}'.format(host=host, port=port)
        elif request.url.startswith("https://"):
            proxy_url = 'https://{host}:{port}'.format(host=host, port=port)
        request.meta['proxy'] = proxy_url  # set the proxy
        logger.debug("using proxy: {}".format(request.meta['proxy']))
        # The tunnel proxy requires authentication:
        # base64-encode "id:password" and send it in the
        # Proxy-Authorization header
        username_password = "{tid}:{password}".format(tid=tid, password=password)
        b64_username_password = base64.b64encode(username_password.encode('utf-8'))
        request.headers['Proxy-Authorization'] = 'Basic ' + b64_username_password.decode('utf-8')
        threshold -= 1
        return None
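The middleware above authenticates every request by sending a Basic `Proxy-Authorization` header. The encoding step in isolation, with the same placeholder credentials:

```python
import base64

tid = "mytid"            # placeholder tunnel id
password = "mypassword"  # placeholder password

# Basic auth token: base64("id:password"), prefixed with "Basic "
token = base64.b64encode("{}:{}".format(tid, password).encode("utf-8"))
header_value = "Basic " + token.decode("utf-8")
print(header_value)  # Basic bXl0aWQ6bXlwYXNzd29yZA==
```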

settings.py

Enable the new middleware by configuring DOWNLOADER_MIDDLEWARES in settings.py:

ROBOTSTXT_OBEY = False  # set this to False to improve the success rate
DOWNLOADER_MIDDLEWARES = {
    'scrapy_proxy_tps.middlewares.ProxyDownloadMiddleware': 100,
}

testproxy.py

Manually create the spider file testproxy.py in the spiders directory:

# -*- coding: utf-8 -*-
import scrapy


class TestproxySpider(scrapy.Spider):
    name = 'testproxy'
    allowed_domains = ['kdlapi.com']
    start_urls = ['http://dev.kdlapi.com/testproxy']

    def parse(self, response):
        print(response.text)

Java

jdk

Using the JDK built-in library

Usage tips

  1. This sample supports both http and https pages
  2. Requires JDK >= 1.6
package com.kuaidaili.sdk;

import java.util.HashMap;
import java.util.Map;

/**
 * Request the proxy server with the JDK built-in library.
 * Works for both http and https pages.
 */
public class TestProxy {

    private static String pageUrl1 = "http://dev.kdlapi.com/testproxy"; // target page (http)
    private static String pageUrl2 = "https://dev.kdlapi.com/testproxy"; // target page (https)
    private static String tunnelHost = "mytunnelhost"; // tunnel server host
    private static String tunnelPort = "mytunnelport"; // tunnel server port
    private static String username = "myusername"; // tunnel id
    private static String password = "mypassword"; // password

    public static void main(String[] args) {
        HttpRequest request = new HttpRequest();
        Map<String, String> params = new HashMap<String, String>();
        Map<String, String> headers = new HashMap<String, String>();

        headers.put("Accept-Encoding", "gzip"); // gzip-compress the response for faster transfers

        Map<String, String> proxySettings = new HashMap<String, String>();
        proxySettings.put("ip", tunnelHost);
        proxySettings.put("port", tunnelPort);
        proxySettings.put("username", username);
        proxySettings.put("password", password);

        try{
            HttpResponse response = request.sendGet(pageUrl1, params, headers, proxySettings);
            System.out.println(response.getCode());
            System.out.println(response.getContent());
        }
        catch (Exception e) {
            e.printStackTrace();
        }

        try{
            HttpResponse response = request.sendGet(pageUrl2, params, headers, proxySettings);
            System.out.println(response.getCode());
            System.out.println(response.getContent());
        }
        catch (Exception e) {
            e.printStackTrace();
        }           
    }
}
The helper classes HttpRequest and HttpResponse:

HttpRequest.java

package com.kuaidaili.sdk;

import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.net.Authenticator;
import java.net.HttpURLConnection;
import java.net.InetSocketAddress;
import java.net.PasswordAuthentication;
import java.net.Proxy;
import java.net.URL;
import java.net.URLEncoder;
import java.nio.charset.Charset;
import java.util.Map;
import java.util.Vector;
import java.util.zip.GZIPInputStream;

/**
 * HTTP request object
 */
public class HttpRequest {

    private String defaultContentEncoding;
    private int connectTimeout = 1000;
    private int readTimeout = 1000;

    public HttpRequest() {
        this.defaultContentEncoding = Charset.defaultCharset().name();
    }

    /**
     * Send a GET request
     *
     * @param urlString the URL
     * @param proxySettings proxy settings; null means no proxy
     * @return the response object
     */
    public HttpResponse sendGet(String urlString, final Map<String, String> proxySettings) throws IOException {
        return this.send(urlString, "GET", null, null, proxySettings);
    }

    /**
     * Send a GET request
     *
     * @param urlString the URL
     * @param params parameter map
     * @param proxySettings proxy settings; null means no proxy
     * @return the response object
     */
    public HttpResponse sendGet(String urlString, Map<String, String> params, final Map<String, String> proxySettings)
            throws IOException {
        return this.send(urlString, "GET", params, null, proxySettings);
    }

    /**
     * Send a GET request
     *
     * @param urlString the URL
     * @param params parameter map
     * @param headers header map
     * @param proxySettings proxy settings; null means no proxy
     * @return the response object
     */
    public HttpResponse sendGet(String urlString, Map<String, String> params,
            Map<String, String> headers, final Map<String, String> proxySettings) throws IOException {
        return this.send(urlString, "GET", params, headers, proxySettings);
    }

    /**
     * Send a POST request
     *
     * @param urlString the URL
     * @param proxySettings proxy settings; null means no proxy
     * @return the response object
     */
    public HttpResponse sendPost(String urlString, final Map<String, String> proxySettings) throws IOException {
        return this.send(urlString, "POST", null, null, proxySettings);
    }

    /**
     * Send a POST request
     *
     * @param urlString the URL
     * @param params parameter map
     * @param proxySettings proxy settings; null means no proxy
     * @return the response object
     */
    public HttpResponse sendPost(String urlString, Map<String, String> params, final Map<String, String> proxySettings)
            throws IOException {
        return this.send(urlString, "POST", params, null, proxySettings);
    }

    /**
     * Send a POST request
     *
     * @param urlString the URL
     * @param params parameter map
     * @param headers header map
     * @param proxySettings proxy settings; null means no proxy
     * @return the response object
     */
    public HttpResponse sendPost(String urlString, Map<String, String> params,
            Map<String, String> headers, final Map<String, String> proxySettings) throws IOException {
        return this.send(urlString, "POST", params, headers, proxySettings);
    }

    /**
     * Send the HTTP request
     */
    private HttpResponse send(String urlString, String method,
            Map<String, String> parameters, Map<String, String> headers, final Map<String, String> proxySettings)
            throws IOException {
        HttpURLConnection urlConnection = null;

        if (method.equalsIgnoreCase("GET") && parameters != null) {
            StringBuffer param = new StringBuffer();
            int i = 0;
            for (String key : parameters.keySet()) {
                if (i == 0)
                    param.append("?");
                else
                    param.append("&");
                param.append(key).append("=").append(URLEncoder.encode(parameters.get(key), "utf-8"));
                i++;
            }
            urlString += param;
        }
        URL url = new URL(urlString);
        if(proxySettings != null){
            Proxy proxy = new Proxy(Proxy.Type.HTTP, new InetSocketAddress(proxySettings.get("ip"), Integer.parseInt(proxySettings.get("port"))));
            urlConnection = (HttpURLConnection) url.openConnection(proxy);
            if(proxySettings.containsKey("username")){
                Authenticator authenticator = new Authenticator() {
                    public PasswordAuthentication getPasswordAuthentication() {
                        return (new PasswordAuthentication(proxySettings.get("username"),
                                proxySettings.get("password").toCharArray()));
                    }
                };
                Authenticator.setDefault(authenticator);
            }
        }
        else{
            urlConnection = (HttpURLConnection) url.openConnection();
        }

        urlConnection.setRequestMethod(method);
        urlConnection.setDoOutput(true);
        urlConnection.setDoInput(true);
        urlConnection.setUseCaches(false);

        urlConnection.setConnectTimeout(connectTimeout);
        urlConnection.setReadTimeout(readTimeout);

        if (headers != null)
            for (String key : headers.keySet()) {
                urlConnection.addRequestProperty(key, headers.get(key));
            }

        if (method.equalsIgnoreCase("POST") && parameters != null) {
            StringBuffer param = new StringBuffer();
            int i = 0;
            for (String key : parameters.keySet()) {
                if(i > 0) param.append("&");
                param.append(key).append("=").append(URLEncoder.encode(parameters.get(key), "utf-8"));
                i++;
            }
            System.out.println(param.toString());
            urlConnection.getOutputStream().write(param.toString().getBytes());
            urlConnection.getOutputStream().flush();
            urlConnection.getOutputStream().close();
        }

        return this.makeContent(urlString, urlConnection);
    }

    /**
     * Build the response object
     */
    private HttpResponse makeContent(String urlString,
            HttpURLConnection urlConnection) throws IOException {
        HttpResponse response = new HttpResponse();
        try {
            InputStream in = urlConnection.getInputStream();
            BufferedReader bufferedReader = new BufferedReader(new InputStreamReader(in));
            if ("gzip".equals(urlConnection.getContentEncoding())) bufferedReader =  new BufferedReader(new InputStreamReader(new GZIPInputStream(in)));
            response.contentCollection = new Vector<String>();
            StringBuffer temp = new StringBuffer();
            String line = bufferedReader.readLine();
            while (line != null) {
                response.contentCollection.add(line);
                temp.append(line).append("\r\n");
                line = bufferedReader.readLine();
            }
            bufferedReader.close();

            String encoding = urlConnection.getContentEncoding();
            if (encoding == null)
                encoding = this.defaultContentEncoding;

            response.urlString = urlString;

            response.defaultPort = urlConnection.getURL().getDefaultPort();
            response.file = urlConnection.getURL().getFile();
            response.host = urlConnection.getURL().getHost();
            response.path = urlConnection.getURL().getPath();
            response.port = urlConnection.getURL().getPort();
            response.protocol = urlConnection.getURL().getProtocol();
            response.query = urlConnection.getURL().getQuery();
            response.ref = urlConnection.getURL().getRef();
            response.userInfo = urlConnection.getURL().getUserInfo();
            response.contentLength = urlConnection.getContentLength();

            response.content = new String(temp.toString().getBytes());
            response.contentEncoding = encoding;
            response.code = urlConnection.getResponseCode();
            response.message = urlConnection.getResponseMessage();
            response.contentType = urlConnection.getContentType();
            response.method = urlConnection.getRequestMethod();
            response.connectTimeout = urlConnection.getConnectTimeout();
            response.readTimeout = urlConnection.getReadTimeout();

            return response;
        } catch (IOException e) {
            throw e;
        } finally {
            if (urlConnection != null){
                urlConnection.disconnect();
            }
        }
    }

    public static byte[] gunzip(byte[] bytes) {  
        if (bytes == null || bytes.length == 0) {  
            return null;  
        }  
        ByteArrayOutputStream out = new ByteArrayOutputStream();  
        ByteArrayInputStream in = new ByteArrayInputStream(bytes);  
        try {  
            GZIPInputStream ungzip = new GZIPInputStream(in);  
            byte[] buffer = new byte[256];  
            int n;  
            while ((n = ungzip.read(buffer)) >= 0) {  
                out.write(buffer, 0, n);  
            }  
        } catch (IOException e) {  
            System.err.println("gzip uncompress error.");
            e.printStackTrace();
        }  

        return out.toByteArray();  
    }

    /**
     * Get the default response charset
     */
    public String getDefaultContentEncoding() {
        return this.defaultContentEncoding;
    }

    /**
     * Set the default response charset
     */
    public void setDefaultContentEncoding(String defaultContentEncoding) {
        this.defaultContentEncoding = defaultContentEncoding;
    }

    public int getConnectTimeout() {
        return connectTimeout;
    }

    public void setConnectTimeout(int connectTimeout) {
        this.connectTimeout = connectTimeout;
    }

    public int getReadTimeout() {
        return readTimeout;
    }

    public void setReadTimeout(int readTimeout) {
        this.readTimeout = readTimeout;
    }
}

HttpResponse.java

package com.kuaidaili.sdk;

import java.util.Vector;

/**
 * HTTP response object
 */
public class HttpResponse {

    String urlString;
    int defaultPort;
    String file;
    String host;
    String path;
    int port;
    String protocol;
    String query;
    String ref;
    String userInfo;
    String contentEncoding;
    int contentLength;
    String content;
    String contentType;
    int code;
    String message;
    String method;

    int connectTimeout;

    int readTimeout;

    Vector<String> contentCollection;

    public String getContent() {
        return content;
    }

    public String getContentType() {
        return contentType;
    }

    public int getCode() {
        return code;
    }

    public String getMessage() {
        return message;
    }

    public Vector<String> getContentCollection() {
        return contentCollection;
    }

    public String getContentEncoding() {
        return contentEncoding;
    }

    public String getMethod() {
        return method;
    }

    public int getConnectTimeout() {
        return connectTimeout;
    }

    public int getReadTimeout() {
        return readTimeout;
    }

    public String getUrlString() {
        return urlString;
    }

    public int getDefaultPort() {
        return defaultPort;
    }

    public String getFile() {
        return file;
    }

    public String getHost() {
        return host;
    }

    public String getPath() {
        return path;
    }

    public int getPort() {
        return port;
    }

    public String getProtocol() {
        return protocol;
    }

    public String getQuery() {
        return query;
    }

    public String getRef() {
        return ref;
    }

    public String getUserInfo() {
        return userInfo;
    }

}

httpclient

HttpClient-4.5.6

Usage tips

  1. This sample supports both http and https pages
  2. Whitelist (IP-based) access is recommended; HttpClient may see occasional authentication failures when using username/password
  3. Requires JDK >= 1.6
  4. Dependencies:
    httpclient-4.5.6.jar
    httpcore-4.4.10.jar
    commons-codec-1.10.jar
    commons-logging-1.2.jar
package com.kuaidaili.sdk;

import java.net.URL;

import org.apache.http.HttpHost;
import org.apache.http.auth.AuthScope;
import org.apache.http.auth.UsernamePasswordCredentials;
import org.apache.http.client.CredentialsProvider;
import org.apache.http.client.config.RequestConfig;
import org.apache.http.client.methods.CloseableHttpResponse;
import org.apache.http.client.methods.HttpGet;
import org.apache.http.impl.client.BasicCredentialsProvider;
import org.apache.http.impl.client.CloseableHttpClient;
import org.apache.http.impl.client.HttpClients;
import org.apache.http.util.EntityUtils;

/**
 * Request the proxy server with HttpClient.
 * Works for both http and https pages.
 */
public class TestProxyHttpClient {

    private static String pageUrl1 = "http://dev.kdlapi.com/testproxy"; // target page (http)
    private static String pageUrl2 = "https://dev.kdlapi.com/testproxy"; // target page (https)
    private static String tunnelHost = "mytunnelhost"; // tunnel server host
    private static int tunnelPort = mytunnelport; // tunnel server port
    private static String username = "myusername"; // tunnel id
    private static String password = "mypassword"; // password

    public static void main(String[] args) throws Exception {
        CredentialsProvider credsProvider = new BasicCredentialsProvider();
        credsProvider.setCredentials(
                new AuthScope(tunnelHost, tunnelPort),
                new UsernamePasswordCredentials(username, password));
        CloseableHttpClient httpclient = HttpClients.custom()
                .setDefaultCredentialsProvider(credsProvider).build();
        try {
            for (String pageUrl : new String[] {pageUrl1, pageUrl2}) {
                URL url = new URL(pageUrl);
                HttpHost target = new HttpHost(url.getHost(), url.getDefaultPort(), url.getProtocol());
                HttpHost proxy = new HttpHost(tunnelHost, tunnelPort);

                RequestConfig config = RequestConfig.custom().setProxy(proxy).build();
                HttpGet httpget = new HttpGet(url.getPath());
                httpget.setConfig(config);
                httpget.addHeader("Accept-Encoding", "gzip"); // gzip-compress the response for faster transfers

                System.out.println("Executing request " + httpget.getRequestLine() + " to " + target + " via " + proxy);

                CloseableHttpResponse response = httpclient.execute(target, httpget);
                try {
                    System.out.println("----------------------------------------");
                    System.out.println(response.getStatusLine());
                    System.out.println(EntityUtils.toString(response.getEntity()));
                } finally {
                    response.close();
                }
            }
        } finally {
            httpclient.close();
        }
    }
}

GoLang

Standard library

Standard library
// Request the proxy server
// Works for both http and https pages
// Go version: Go 1
package main

import (
    "compress/gzip"
    "fmt"
    "io"
    "io/ioutil"
    "net/http"
    "net/url"
    "os"
)

func main() {
    // Username and password (tunnel id and password)
    mytid := "mytid"
    password := "mypassword"

    // Tunnel host and port
    proxy_raw := "mytunnelhost:mytunnelport"
    proxy_str := fmt.Sprintf("http://%s:%s@%s", mytid, password, proxy_raw)
    proxy, err := url.Parse(proxy_str)

    // Target page
    page_url := "http://dev.kdlapi.com/testproxy"

    // Request the target page
    client := &http.Client{Transport: &http.Transport{Proxy: http.ProxyURL(proxy)}}
    req, _ := http.NewRequest("GET", page_url, nil)
    req.Header.Add("Accept-Encoding", "gzip") // gzip-compress the response for faster transfers
    res, err := client.Do(req)

    if err != nil {
        // The request failed
        fmt.Println(err.Error())
    } else {
        defer res.Body.Close() // make sure Body is closed

        fmt.Println("status code:", res.StatusCode) // status code

        // If the response is gzip-compressed, decompress it before reading
        if res.Header.Get("Content-Encoding") == "gzip" {
            reader, _ := gzip.NewReader(res.Body) // gzip decompression
            defer reader.Close()
            io.Copy(os.Stdout, reader)
            os.Exit(0) // normal exit
        }

        // No gzip compression; read the body directly
        body, _ := ioutil.ReadAll(res.Body)
        fmt.Println(string(body))
    }
}

CSharp

Standard library

Standard library

Usage tips

  • Works for both http and https pages
using System;
using System.Text;
using System.Net;
using System.IO;
using System.IO.Compression;

namespace csharp_http
{
    class Program
    {
        static void Main(string[] args)
        {
            // Target page to fetch
            string page_url = "http://dev.kdlapi.com/testproxy";

            // Build the request
            HttpWebRequest request = (HttpWebRequest)WebRequest.Create(page_url);
            request.Method = "GET";
            request.Headers.Add("Accept-Encoding", "Gzip");  // gzip-compress the response for faster transfers

            // Tunnel server host/ip
            string proxy_ip = "mytunnelhost";
            int proxy_port = mytunnelport;

            // Tunnel id and password <tunnel proxy>
            string mytid = "mytid";
            string password = "mypassword";

            // Set the proxy <tunnel with IP whitelist>
            //request.Proxy = new WebProxy(proxy_ip, proxy_port);

            // Set the proxy <tunnel without IP whitelist>
            WebProxy proxy = new WebProxy();
            proxy.Address = new Uri(String.Format("http://{0}:{1}", proxy_ip, proxy_port));
            proxy.Credentials = new NetworkCredential(mytid, password);
            request.Proxy = proxy;

            // Request the target page
            HttpWebResponse response = (HttpWebResponse)request.GetResponse();

            Console.WriteLine((int)response.StatusCode);  // status code
            // Decompress and read the response body
            using (StreamReader reader = new StreamReader(new GZipStream(response.GetResponseStream(), CompressionMode.Decompress))) {
                Console.WriteLine(reader.ReadToEnd());
            }

        }
    }
}

Ruby

net/http

net/http (IP whitelist)

Usage tips

  • http/https proxy via net/http, authenticated by IP whitelist
# -*- coding: utf-8 -*-
# ruby2.1.5



require 'net/http'  # built-in net/http module
require 'zlib'
require 'stringio'

# Tunnel server host/ip and port
proxy_ip = 'mytunnelhost'
proxy_port = mytunnelport


# Target page; the kuaidaili testproxy page is used as an example
page_url = "https://dev.kuaidaili.com/testproxy"
uri = URI(page_url)

# Create a proxy instance
proxy = Net::HTTP::Proxy(proxy_ip, proxy_port)

# Create the request object
req = Net::HTTP::Get.new(uri)
# Set the User-Agent
req['User-Agent'] = 'Mozilla/5.0 (Macintosh; U; Intel Mac OS X 10_6_8; en-us) AppleWebKit/534.50 (KHTML, like Gecko) Version/5.1 Safari/534.50'
req['Accept-Encoding'] = 'gzip'  # gzip-compress the response for faster transfers


# Make the request through the proxy; set use_ssl to false for http pages
res = proxy.start(uri.hostname, uri.port, :use_ssl => true) do |http|
    http.request(req)
end

# Print the status code
puts "status code: #{res.code}"

# Print the response body
if  res.code.to_i != 200 then
    puts "page content: #{res.body}"
else
    gz = Zlib::GzipReader.new(StringIO.new(res.body.to_s))
    puts "page content: #{gz.read}"
end
net/http (username/password auth)

Usage tips

  • http/https proxy via net/http, authenticated by username and password
# -*- coding: utf-8 -*-
# ruby2.1.5

require 'net/http'  # built-in net/http module
require 'zlib'
require 'stringio'

# Tunnel server host/ip and port
proxy_ip = 'xx'
proxy_port = xx

# Tunnel id and password
mytid = 'yourmytid'
password = 'yourpassword'

# Target page; the kuaidaili testproxy page is used as an example
page_url = "https://dev.kuaidaili.com/testproxy"
uri = URI(page_url)

# Create a proxy instance
proxy = Net::HTTP::Proxy(proxy_ip, proxy_port, mytid, password)

# Create the request object
req = Net::HTTP::Get.new(uri)
# Set username/password auth
req.basic_auth(mytid, password)
# Set the User-Agent
req['User-Agent'] = 'Mozilla/5.0 (Macintosh; U; Intel Mac OS X 10_6_8; en-us) AppleWebKit/534.50 (KHTML, like Gecko) Version/5.1 Safari/534.50'
req['Accept-Encoding'] = 'gzip'  # gzip-compress the response for faster transfers


# Make the request through the proxy; set use_ssl to false for http pages
res = proxy.start(uri.hostname, uri.port, :use_ssl => true) do |http|
    http.request(req)
end

# Print the status code
puts "status code: #{res.code}"

# Print the response body
if  res.code.to_i != 200 then
    puts "page content: #{res.body}"
else
    gz = Zlib::GzipReader.new(StringIO.new(res.body.to_s))
    puts "page content: #{gz.read}" 
end

httparty

httparty (IP whitelist)

Usage tips

  • http/https proxy via httparty, authenticated by IP whitelist
# ruby2.1.5
require "httparty"  # 引入httparty模块
require 'zlib'
require 'stringio'


# 隧道代理host和端口
proxy_ip = 'mytunnelhost'
proxy_port = mytunnelport


# 要访问的目标网页, 以京东首页为例
page_url = 'https://dev.kuaidaili.com/testproxy'

# 设置headers
headers = {
    "User-Agent" => "Mozilla/5.0 (Macintosh; U; Intel Mac OS X 10_6_8; en-us) AppleWebKit/534.50 (KHTML, like Gecko) Version/5.1 Safari/534.50",
    "Accept-Encoding" => "gzip",
}

# 设置代理
options = {
    :headers => headers, 
    :http_proxyaddr => proxy_ip, 
    :http_proxyport => proxy_port,
}

# 发起请求
res = HTTParty.get(page_url, options)

# 输出状态码
puts "status code: #{res.code}"

# 输出响应体
if  res.code.to_i != 200 then
    puts "page content: #{res.body}"
else
    gz = Zlib::GzipReader.new(StringIO.new(res.body.to_s))
    puts "page content: #{gz.read}" 
end
httparty (username/password auth)

Usage tips

  • http/https proxy via httparty, authenticated by username and password
# ruby2.1.5
require "httparty"  # 引入httparty模块
require 'zlib'
require 'stringio'

# 隧道代理host和端口
proxy_ip = 'xxx'
proxy_port = xx

# 用户名密码
username = 'yourusername'
password = 'yourpassword'

# 要访问的目标网页, 以京东首页为例
page_url = 'https://dev.kuaidaili.com/testproxy'

# 设置headers
headers = {
    "User-Agent" => "Mozilla/5.0 (Macintosh; U; Intel Mac OS X 10_6_8; en-us) AppleWebKit/534.50 (KHTML, like Gecko) Version/5.1 Safari/534.50",
    "Accept-Encoding" => "gzip",
}

# 设置代理
options = {
    :headers => headers, 
    :http_proxyaddr => proxy_ip, 
    :http_proxyport => proxy_port, 
    :http_proxyuser => username, 
    :http_proxypass => password,
}

# 发起请求
res = HTTParty.get(page_url, options)

# 输出状态码
puts "status code: #{res.code}"

# 输出响应体
if  res.code.to_i != 200 then
    puts "page content: #{res.body}"
else
    gz = Zlib::GzipReader.new(StringIO.new(res.body.to_s))
    puts "page content: #{gz.read}" 
end

Node.js

Standard library (for http and https)

Standard library (for http and https requests)

Usage tips

  • Works for both http and https pages
let http = require('http'); // built-in http module
let tls = require('tls'); // built-in tls module
let util = require('util');

// Tunnel id and password; not needed if your IP is whitelisted
const mytid = 'mytid';
const password = 'mypassword';
const auth = 'Basic ' + Buffer.from(mytid + ':' + password).toString('base64');

// Tunnel server host/ip and port
let proxy_ip = 'mytunnelhost';
let proxy_port = mytunnelport;

// Target host and path; the jd.com homepage is used as an example
let remote_host = 'www.jd.com';
let remote_path = '/';

// Issue a CONNECT request
let req = http.request({
    host: proxy_ip,
    port: proxy_port,
    method: 'CONNECT',
    path: util.format('%s:443', remote_host),
    headers: {
        "Host": remote_host,
        "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/60.0.3100.0 Safari/537.36",
        "Proxy-Authorization": auth,
        "Accept-Encoding": "gzip"   // gzip-compress the response for faster transfers
    }
});


req.on('connect', function (res, socket, head) {
    // TLS handshake
    let tlsConnection = tls.connect({
        host: remote_host,
        socket: socket
    }, function () {
        // Send the GET request
        tlsConnection.write(util.format('GET %s HTTP/1.1\r\nHost: %s\r\n\r\n', remote_path, remote_host));
    });

    tlsConnection.on('data', function (data) {
        // Print the response (the full raw response message)
        console.log(data.toString());
    });
});

req.end();

request

request (recommended)

Usage notes

  • Install the request library first: npm install request
  • Works for both http and https pages
let request = require('request'); // third-party request library
let util = require('util');
let zlib = require('zlib');

// Tunnel id and password; not needed if your IP is whitelisted
const mytid = 'mytid';
const password = 'mypassword';

// Target URL to request
let page_url = 'https://www.jd.com';

// Tunnel proxy server host/ip and port
let proxy_ip = 'mytunnelhost';
let proxy_port = mytunnelport;

// Full proxy server url
let proxy = util.format('http://%s:%s@%s:%d', mytid, password, proxy_ip, proxy_port);

// Send the request
request({
    url: page_url,
    method: 'GET',
    proxy: proxy,
    headers: {
        "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/60.0.3100.0 Safari/537.36",
        "Accept-Encoding": "gzip"   // request gzip compression for faster transfers
    },
    encoding: null,  // keep the body as a Buffer so it can be decompressed
}, function(error, res, body) {
    if (!error && res.statusCode == 200) {
        // Body is gzip-compressed
        if (res.headers['content-encoding'] && res.headers['content-encoding'].indexOf('gzip') != -1) {
            zlib.gunzip(body, function(err, dezipped) {
                console.log(dezipped.toString());
            });
        } else {
            // Body is not compressed
            console.log(body);
        }
    } else {
        console.log(error);
    }
});
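One practical caveat when building the proxy URL as above: if the tunnel id or password contains reserved URL characters such as `:`, `@`, or `/`, the resulting URL is parsed incorrectly. Encoding the credentials with `encodeURIComponent` first avoids this; the credentials below are made-up examples:

```javascript
// Hypothetical credentials containing reserved URL characters
const tid = 'my:tid';
const pwd = 'p@ss/word';

// URL-encode each part before embedding it in the proxy URL
const proxyUrl = `http://${encodeURIComponent(tid)}:${encodeURIComponent(pwd)}@mytunnelhost:mytunnelport`;
console.log(proxyUrl); // http://my%3Atid:p%40ss%2Fword@mytunnelhost:mytunnelport
```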

puppeteer

puppeteer (IP whitelist)

Usage notes

  • Puppeteer with an http/https proxy using IP whitelist authentication
  • Runtime requirements: node 7.6.0 or above + puppeteer
  • Install puppeteer first: npm i puppeteer
// load the puppeteer module
const puppeteer = require('puppeteer');

// Target page to request
const url = 'http://dev.kuaidaili.com/testproxy';

// Extra headers
const headers = {
    'Accept-Encoding': 'gzip' // request gzip compression for faster transfers
};


// Tunnel proxy server host/ip and port
let proxy_ip = 'mytunnelhost';
let proxy_port = mytunnelport;

(async ()=> {
    // Launch a browser instance
    const browser = await puppeteer.launch({
        headless: false,  // whether to hide the window; defaults to true, false is handy for debugging
        args: [
            `--proxy-server=${proxy_ip}:${proxy_port}`,
            '--no-sandbox',
            '--disable-setuid-sandbox'
        ]
    });

    // Open a new page
    const page = await browser.newPage();

    // Set headers
    await page.setExtraHTTPHeaders(headers);

    // Visit the target page
    await page.goto(url);

})();
puppeteer (username/password authentication)

Usage notes

  • Puppeteer with an http/https proxy using username/password authentication
  • Runtime requirements: node 7.6.0 or above + puppeteer
  • Install puppeteer first: npm i puppeteer
// load the puppeteer module
const puppeteer = require('puppeteer');

// Target page to request
const url = 'http://dev.kuaidaili.com/testproxy';

// Extra headers
const headers = {
    'Accept-Encoding': 'gzip'
};

// Tunnel proxy server host/ip and port
let proxy_ip = 'xxx';
let proxy_port = xxx;


// Tunnel id and password (available in the member center)
const mytid = 'youmytid';
const password = 'yourpassword';

(async ()=> {
    // Launch a browser instance
    const browser = await puppeteer.launch({
        headless: false,  // whether to hide the window; defaults to true, false is handy for debugging
        args: [
            `--proxy-server=${proxy_ip}:${proxy_port}`,
            '--no-sandbox',
            '--disable-setuid-sandbox'
        ]
    });

    // Open a new page
    const page = await browser.newPage();

    // Set headers
    await page.setExtraHTTPHeaders(headers);

    // Username/password authentication
    // (page.authenticate expects the keys "username" and "password")
    await page.authenticate({username: mytid, password: password});

    // Visit the target page
    await page.goto(url);
})();

PHP

curl

curl

Usage notes

  1. This sample works for both http and https pages
  2. curl is not a PHP built-in extension and must be installed before use:
    Ubuntu/Debian: apt-get install php5-curl
    CentOS: yum install php-curl
<?php
// Target page to request
$page_url = "http://dev.kdlapi.com/testproxy";

// Tunnel proxy server host + port
$proxy = "mytunnelhost:mytunnelport";

// Tunnel id and password (tunnel proxy)
$mytid   = "mytid";
$password   = "mypassword";

$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, $page_url);

curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, FALSE);
curl_setopt($ch, CURLOPT_SSL_VERIFYHOST, FALSE);

// Set the proxy
curl_setopt($ch, CURLOPT_PROXYTYPE, CURLPROXY_HTTP);
curl_setopt($ch, CURLOPT_PROXY, $proxy);
// Set the proxy username and password (tunnel proxy)
curl_setopt($ch, CURLOPT_PROXYAUTH, CURLAUTH_BASIC);
curl_setopt($ch, CURLOPT_PROXYUSERPWD, "{$mytid}:{$password}");

// Custom headers
$headers = array();
$headers[] = 'User-Agent: Mozilla/5.0 (compatible; MSIE 9.0; Windows NT 6.1; Trident/5.0);';
curl_setopt($ch, CURLOPT_HTTPHEADER, $headers);

// Custom cookie
curl_setopt($ch, CURLOPT_COOKIE, '');

curl_setopt($ch, CURLOPT_ENCODING, 'gzip'); // use gzip compression for faster transfers

curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, 5);
curl_setopt($ch, CURLOPT_TIMEOUT, 10);

curl_setopt($ch, CURLOPT_HEADER, true);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);

$result = curl_exec($ch);
$info = curl_getinfo($ch);
curl_close($ch);

echo $result;
echo "\n\nfetch ".$info['url']."\ntimeuse: ".$info['total_time']."s\n\n";
?>