Commit 97ca441

Committed Aug 18, 2015
How to use java.net.URLConnection to send and receive HTTP requests
1 parent 7667394 commit 97ca441

2 files changed: +219 -4 lines changed
‎README.md

+5 -4
@@ -21,14 +21,14 @@ stackoverflow-Java-top-qa
 
 > Networking
 
-*
-
+* [How to use java.net.URLConnection to send and receive HTTP requests](https://github.com/giantray/stackoverflow-java-top-qa/blob/master/contents/using-java-net-urlconnection-to-fire-and-handle-http-requests.md)
+
 > Performance
 
 * [When to use LinkedList versus ArrayList, and how to decide between them](https://github.com/giantray/stackoverflow-java-top-qa/blob/master/contents/when-to-use-linkedlist-over-arraylist.md)
 
 
-### Links to questions awaiting translation (0 remaining)
+### Links to questions awaiting translation (x remaining)
 - [Why is processing a sorted array faster than an unsorted array?](http://stackoverflow.com/questions/11227809/why-is-processing-a-sorted-array-faster-than-an-unsorted-array)
 - [Why is subtracting these two times (in 1927) giving a strange result?](http://stackoverflow.com/questions/6841333/why-is-subtracting-these-two-times-in-1927-giving-a-strange-result)
 - [Proper use cases for Android UserManager.isUserAGoat()?](http://stackoverflow.com/questions/13375357/proper-use-cases-for-android-usermanager-isuseragoat)
@@ -41,7 +41,8 @@ stackoverflow-Java-top-qa
 - [Converting String to int in Java?](http://stackoverflow.com/questions/5585779/converting-string-to-int-in-java)
 - [Is there a unique Android device ID?](http://stackoverflow.com/questions/2785485/is-there-a-unique-android-device-id)
 - [How to test a class that has private methods, fields or inner classes](http://stackoverflow.com/questions/34571/how-to-test-a-class-that-has-private-methods-fields-or-inner-classes)
-- [Using java.net.URLConnection to fire and handle HTTP requests](http://stackoverflow.com/questions/2793150/using-java-net-urlconnection-to-fire-and-handle-http-requests)
+-
+
 - [Why does this code using random strings print “hello world”?](http://stackoverflow.com/questions/15182496/why-does-this-code-using-random-strings-print-hello-world)
 - [Iterate over each Entry in a Map](http://stackoverflow.com/questions/46898/iterate-over-each-entry-in-a-map)
 - [How can I create an executable jar with dependencies using Maven?](http://stackoverflow.com/questions/574594/how-can-i-create-an-executable-jar-with-dependencies-using-maven)
@@ -0,0 +1,214 @@
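## How to use java.net.URLConnection to send and receive HTTP requests

First, a disclaimer: the code below consists of basic examples only. To be more rigorous, you should also add code to handle the various exceptions (such as IOException, NullPointerException, and ArrayIndexOutOfBoundsException).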

### Preparation
First, set the request URL and the charset (encoding). Which additional parameters are needed depends on the requirements of the URL in question.
```java
String url = "http://example.com";
String charset = "UTF-8";
String param1 = "value1";
String param2 = "value2";
// ...
// URLEncoder is java.net.URLEncoder; encode() throws UnsupportedEncodingException for an unknown charset.
String query = String.format("param1=%s&param2=%s",
        URLEncoder.encode(param1, charset),
        URLEncoder.encode(param2, charset));
```
Request parameters must be in name=value format, with the parameters joined by &. Normally you also need to [URL-encode](http://en.wikipedia.org/wiki/Percent-encoding) them with [URLEncoder#encode()](http://docs.oracle.com/javase/6/docs/api/java/net/URLEncoder.html).
The example above also uses String#format(). That is purely for convenience; I prefer it for building strings.
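
The String#format() call above hard-codes exactly two parameters. As a minimal sketch of the same name=value&name=value rule for any number of parameters, the helper below joins the entries of a map. The class name `QueryStrings` and the method `build` are hypothetical names chosen for illustration; they are not part of the original answer.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;
import java.util.Map;

// Hypothetical helper for illustration only.
class QueryStrings {

    /** Joins the entries as name=value pairs separated by '&', URL-encoding both names and values. */
    static String build(Map<String, String> params, String charset) throws UnsupportedEncodingException {
        StringBuilder query = new StringBuilder();
        for (Map.Entry<String, String> entry : params.entrySet()) {
            if (query.length() > 0) {
                query.append('&'); // Separator between pairs, never before the first one.
            }
            query.append(URLEncoder.encode(entry.getKey(), charset))
                 .append('=')
                 .append(URLEncoder.encode(entry.getValue(), charset));
        }
        return query.toString();
    }
}
```

Passing a LinkedHashMap keeps the parameters in insertion order, which makes the resulting query string predictable.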

### Firing an [HTTP GET](http://www.w3.org/Protocols/rfc2616/rfc2616-sec9.html#sec9.3) request (optionally with query parameters)
This is a simple task, because GET is the default request method:
```java
URLConnection connection = new URL(url + "?" + query).openConnection();
connection.setRequestProperty("Accept-Charset", charset);
InputStream response = connection.getInputStream(); // This call is what actually sends the request.
```
The URL and the query string are joined with ?. The [Accept-Charset](http://www.w3.org/Protocols/rfc2616/rfc2616-sec14.html#sec14.2) request header can hint to the server which encoding the parameters are in. If you send no query string, you can leave Accept-Charset out. If you don't need to set any headers at all, you can use [URL#openStream()](http://docs.oracle.com/javase/6/docs/api/java/net/URL.html#openStream%28%29) instead of openConnection().
Either way, if the server side is an [HttpServlet](http://docs.oracle.com/javaee/6/api/javax/servlet/http/HttpServlet.html), your GET request triggers its doGet() method, which can read the parameters through [HttpServletRequest#getParameter()](http://docs.oracle.com/javaee/6/api/javax/servlet/ServletRequest.html#getParameter%28java.lang.String%29).
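
The snippet above stops at the raw InputStream. As a rough sketch of what usually comes next, the class below reads the whole response into a String. `SimpleGet`, `readAll` and `get` are hypothetical names, and decoding the response with the same charset that was sent in the request is an assumption made for brevity, not something the server guarantees.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.net.URL;
import java.net.URLConnection;

// Hypothetical helper for illustration only.
class SimpleGet {

    /** Decodes the whole stream with the given charset and closes it afterwards. */
    static String readAll(InputStream in, String charset) throws IOException {
        StringBuilder text = new StringBuilder();
        try (Reader reader = new InputStreamReader(in, charset)) {
            char[] buffer = new char[4096];
            for (int n; (n = reader.read(buffer)) != -1;) {
                text.append(buffer, 0, n);
            }
        }
        return text.toString();
    }

    /** Fires a GET request to url?query and returns the response body as text. */
    static String get(String url, String query, String charset) throws IOException {
        URLConnection connection = new URL(url + "?" + query).openConnection();
        connection.setRequestProperty("Accept-Charset", charset);
        // Assumption: the server answers in the charset we asked for.
        return readAll(connection.getInputStream(), charset);
    }
}
```

Closing the reader also closes the underlying response stream, so the connection's resources are released once the body has been read.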

### Firing an [HTTP POST](http://www.w3.org/Protocols/rfc2616/rfc2616-sec9.html#sec9.5) request with query parameters
Setting [URLConnection#setDoOutput()](http://docs.oracle.com/javase/6/docs/api/java/net/URLConnection.html#setDoOutput%28boolean%29) to true implicitly sets the request method to POST. A standard HTTP POST form has the Content-Type application/x-www-form-urlencoded, and the query string is written to the request body, as in the following code:
```java
URLConnection connection = new URL(url).openConnection();
connection.setDoOutput(true); // Triggers POST.
connection.setRequestProperty("Accept-Charset", charset);
connection.setRequestProperty("Content-Type", "application/x-www-form-urlencoded;charset=" + charset);

try (OutputStream output = connection.getOutputStream()) {
    output.write(query.getBytes(charset));
}

InputStream response = connection.getInputStream();
```

Note:
When you submit an HTML form programmatically, be sure to also send the name=value pairs of any `<input type="hidden">` elements. The same goes for the `<input type="submit">` element, because the server side usually needs it to tell which button triggered the submission.

You can also use [HttpURLConnection](http://docs.oracle.com/javase/6/docs/api/java/net/HttpURLConnection.html) instead of [URLConnection](http://docs.oracle.com/javase/6/docs/api/java/net/URLConnection.html) and call [HttpURLConnection#setRequestMethod()](http://docs.oracle.com/javase/6/docs/api/java/net/HttpURLConnection.html#setRequestMethod%28java.lang.String%29) to set the request method to POST.
```java
HttpURLConnection httpConnection = (HttpURLConnection) new URL(url).openConnection();
httpConnection.setRequestMethod("POST");
// setDoOutput(true) is still required before a request body can be written.
```
Likewise, if the server side is an [HttpServlet](http://docs.oracle.com/javaee/6/api/javax/servlet/http/HttpServlet.html), the request triggers its [doPost()](http://docs.oracle.com/javaee/6/api/javax/servlet/http/HttpServlet.html#doPost%28javax.servlet.http.HttpServletRequest,%20javax.servlet.http.HttpServletResponse%29) method, which can read the POST parameters through [HttpServletRequest#getParameter()](http://docs.oracle.com/javaee/6/api/javax/servlet/ServletRequest.html#getParameter%28java.lang.String%29).
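
The two POST variants above can be combined. The following is a minimal sketch, assuming the query string is already URL-encoded; `FormPost`, `prepare` and `post` are hypothetical names introduced here. Configuration is kept separate from sending, so the settings can be inspected before anything goes over the wire.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.net.HttpURLConnection;
import java.net.URL;

// Hypothetical helper for illustration only.
class FormPost {

    /** Configures a form POST without sending it; openConnection() does not connect yet. */
    static HttpURLConnection prepare(String url, String charset) throws IOException {
        HttpURLConnection connection = (HttpURLConnection) new URL(url).openConnection();
        connection.setRequestMethod("POST");
        connection.setDoOutput(true); // Needed before getOutputStream() may be called.
        connection.setRequestProperty("Accept-Charset", charset);
        connection.setRequestProperty("Content-Type",
                "application/x-www-form-urlencoded;charset=" + charset);
        return connection;
    }

    /** Writes the already URL-encoded query as the request body and returns the HTTP status code. */
    static int post(String url, String query, String charset) throws IOException {
        HttpURLConnection connection = prepare(url, charset);
        try (OutputStream output = connection.getOutputStream()) {
            output.write(query.getBytes(charset));
        }
        return connection.getResponseCode(); // Asking for the status completes the request.
    }
}
```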

### Actually firing the HTTP request
You can fire the request explicitly with [URLConnection#connect()](http://docs.oracle.com/javase/6/docs/api/java/net/URLConnection.html#connect%28%29), but it is also fired automatically as soon as you call a method that reads information about the response. For example, [URLConnection#getInputStream()](http://docs.oracle.com/javase/6/docs/api/java/net/URLConnection.html#getInputStream%28%29) triggers the request, so calling connect() as well is unnecessary. The examples above all call getInputStream() directly.

### Gathering HTTP response information
1. HTTP response status:
This assumes you are using an [HttpURLConnection](http://docs.oracle.com/javase/6/docs/api/java/net/HttpURLConnection.html).
```java
int status = httpConnection.getResponseCode();
```
2. HTTP response headers:
```java
// Entry is java.util.Map.Entry.
for (Entry<String, List<String>> header : connection.getHeaderFields().entrySet()) {
    System.out.println(header.getKey() + "=" + header.getValue());
}
```
3. HTTP response encoding:
When the Content-Type contains a charset parameter, the response body is encoded in that charset, so it must also be decoded with that charset.

```java
String contentType = connection.getHeaderField("Content-Type");
String charset = null;

if (contentType != null) { // The header may be missing altogether.
    for (String param : contentType.replace(" ", "").split(";")) {
        if (param.startsWith("charset=")) {
            charset = param.split("=", 2)[1];
            break;
        }
    }
}

if (charset != null) {
    try (BufferedReader reader = new BufferedReader(new InputStreamReader(response, charset))) {
        for (String line; (line = reader.readLine()) != null;) {
            // ... System.out.println(line) ?
        }
    }
}
else {
    // It's likely binary content, use InputStream/OutputStream.
}
```
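
The charset lookup is a good candidate for a small helper. The sketch below is one possible way to write it, not the original answer's code: `ContentTypes` and `charsetOf` are hypothetical names, and the handling of a missing header, mixed-case parameter names and quoted values is an addition of mine.

```java
import java.util.Locale;

// Hypothetical helper for illustration only.
class ContentTypes {

    /** Returns the charset parameter of a Content-Type header value, or null when there is none. */
    static String charsetOf(String contentType) {
        if (contentType == null) {
            return null; // The response had no Content-Type header.
        }
        for (String param : contentType.split(";")) {
            String trimmed = param.trim();
            if (trimmed.toLowerCase(Locale.ROOT).startsWith("charset=")) {
                String value = trimmed.substring("charset=".length()).trim();
                // The value may be quoted, as in charset="utf-8".
                if (value.length() >= 2 && value.startsWith("\"") && value.endsWith("\"")) {
                    value = value.substring(1, value.length() - 1);
                }
                return value.isEmpty() ? null : value;
            }
        }
        return null; // Probably binary content.
    }
}
```

A call such as `ContentTypes.charsetOf(connection.getHeaderField("Content-Type"))` then tells you whether to wrap the response in a Reader or to treat it as raw bytes.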


### Maintaining the session
A server-side session is usually backed by a cookie. You can manage cookies through the [CookieHandler API](http://docs.oracle.com/javase/8/docs/api/java/net/CookieHandler.html). Before sending any HTTP request, set up a [CookieManager](http://docs.oracle.com/javase/6/docs/api/java/net/CookieManager.html) with a [CookiePolicy](http://docs.oracle.com/javase/6/docs/api/java/net/CookiePolicy.html) of [ACCEPT_ALL](http://docs.oracle.com/javase/6/docs/api/java/net/CookiePolicy.html#ACCEPT_ALL).
```java
// First set the default cookie manager.
CookieHandler.setDefault(new CookieManager(null, CookiePolicy.ACCEPT_ALL));
// All subsequent URLConnections will use the same cookie manager.
URLConnection connection = new URL(url).openConnection();
// ...
connection = new URL(url).openConnection();
// ...
connection = new URL(url).openConnection();
// ...
```

Note that this approach does not work in every situation. If it fails for you, you can set the cookies yourself: take the Set-Cookie headers from the response and pass the cookies on in the subsequent requests.
```java
// Gather all cookies on the first request.
URLConnection connection = new URL(url).openConnection();
List<String> cookies = connection.getHeaderFields().get("Set-Cookie");
// ...

// Then use the same cookies on all subsequent requests.
connection = new URL(url).openConnection();
for (String cookie : cookies) {
    connection.addRequestProperty("Cookie", cookie.split(";", 2)[0]);
}
// ...
```
The split(";", 2)[0] strips the cookie attributes that are irrelevant to the server (such as expires and path). Alternatively, cookie.substring(0, cookie.indexOf(';')) does the same, provided the value actually contains a semicolon.
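
One thing to watch in the manual approach: getHeaderFields().get("Set-Cookie") returns null when the server sent no cookies. The sketch below guards against that and joins all name=value pairs into a single Cookie header value. `CookieHeaders` and `toCookieHeader` are hypothetical names, and sending one combined header rather than one header per cookie is my own design choice here.

```java
import java.util.List;

// Hypothetical helper for illustration only.
class CookieHeaders {

    /** Builds one Cookie request header value from the Set-Cookie values of a response. */
    static String toCookieHeader(List<String> setCookieValues) {
        StringBuilder header = new StringBuilder();
        if (setCookieValues == null) {
            return ""; // The server sent no cookies.
        }
        for (String setCookie : setCookieValues) {
            // Keep only the leading name=value pair; drop attributes such as expires and path.
            String pair = setCookie.split(";", 2)[0].trim();
            if (pair.isEmpty()) {
                continue;
            }
            if (header.length() > 0) {
                header.append("; ");
            }
            header.append(pair);
        }
        return header.toString();
    }
}
```

If the result is not empty, it can be set on the next request with `connection.setRequestProperty("Cookie", value)`.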

### Streaming mode
Regardless of whether you set a fixed content length yourself with connection.setRequestProperty("Content-Length", contentLength), [HttpURLConnection](http://docs.oracle.com/javase/6/docs/api/java/net/HttpURLConnection.html) by default buffers the entire request body before sending it. Sending a large POST request (for example, uploading a file) may therefore cause an OutOfMemoryError. To avoid this, set [HttpURLConnection#setFixedLengthStreamingMode()](http://docs.oracle.com/javase/6/docs/api/java/net/HttpURLConnection.html#setFixedLengthStreamingMode%28int%29):

`httpConnection.setFixedLengthStreamingMode(contentLength);`

If the content length is not known beforehand, use [HttpURLConnection#setChunkedStreamingMode()](http://docs.oracle.com/javase/6/docs/api/java/net/HttpURLConnection.html#setChunkedStreamingMode%28int%29) instead. The Transfer-Encoding header then becomes chunked and the request is sent in chunks. In the example below, the body is sent in chunks of 1 KB:
```java
httpConnection.setChunkedStreamingMode(1024);
```
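
The two modes are mutually exclusive: once one is set on a connection, setting the other throws an IllegalStateException. The sketch below shows one way to pick between them and stream a file without buffering it in memory. `StreamingModes`, `configure` and `upload` are hypothetical names, and the application/octet-stream content type is an assumption for the sake of the example.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.net.HttpURLConnection;
import java.net.URL;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper for illustration only.
class StreamingModes {

    /** Uses fixed-length streaming when the body size is known (>= 0), chunked streaming otherwise. */
    static void configure(HttpURLConnection connection, long contentLength) {
        connection.setDoOutput(true);
        if (contentLength >= 0) {
            connection.setFixedLengthStreamingMode(contentLength);
        } else {
            connection.setChunkedStreamingMode(1024); // 1 KB chunks, as in the example above.
        }
    }

    /** Streams the file as the POST body and returns the HTTP status code. */
    static int upload(String url, Path file) throws IOException {
        HttpURLConnection connection = (HttpURLConnection) new URL(url).openConnection();
        connection.setRequestMethod("POST");
        connection.setRequestProperty("Content-Type", "application/octet-stream"); // Assumed type.
        configure(connection, Files.size(file));
        try (OutputStream output = connection.getOutputStream()) {
            Files.copy(file, output); // Written straight to the socket, not buffered as a whole.
        }
        return connection.getResponseCode();
    }
}
```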

### User-Agent
Sometimes a request returns the expected response only when it comes from a web browser, and fails otherwise. This is probably related to the User-Agent request header. By default, requests sent through URLConnection carry the User-Agent Java/1.6.0_19, that is, "Java" followed by the JRE version. You can override it:
```java
connection.setRequestProperty("User-Agent", "Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9.2.3) Gecko/20100401"); // Do as if you're using Firefox 3.6.3.
```
A more complete [list of browser User-Agent strings](http://www.useragentstring.com/pages/useragentstring.php) is available.

### Error handling
If the HTTP response code is 4xx (client error) or 5xx (server error), you can read HttpURLConnection#getErrorStream(); the server may have put some useful error information there.
```java
InputStream error = ((HttpURLConnection) connection).getErrorStream();
```
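
In practice the status code decides which stream to read. The following sketch picks getErrorStream() for 4xx and 5xx responses and getInputStream() otherwise. `ErrorBodies` and `readBody` are hypothetical names, and the "status: body" return format is only there to keep the example short.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.net.HttpURLConnection;

// Hypothetical helper for illustration only.
class ErrorBodies {

    /** Returns "<status>: <body>", reading the error stream for 4xx and 5xx responses. */
    static String readBody(HttpURLConnection connection, String charset) throws IOException {
        int status = connection.getResponseCode();
        InputStream stream = status >= 400 ? connection.getErrorStream() : connection.getInputStream();
        if (stream == null) {
            return status + ": "; // getErrorStream() returns null when the server sent no body.
        }
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (InputStream in = stream) {
            byte[] buffer = new byte[4096];
            for (int n; (n = in.read(buffer)) != -1;) {
                bytes.write(buffer, 0, n);
            }
        }
        return status + ": " + bytes.toString(charset);
    }
}
```

Calling getInputStream() on a 4xx or 5xx response throws an IOException, which is why the status is checked first.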

### Uploading files
Generally, you need to send the POST content as [multipart/form-data](http://www.w3.org/TR/html401/interact/forms.html#h-17.13.4.2) (the related RFC is [RFC2388](http://www.faqs.org/rfcs/rfc2388.html)).
```java
String param = "value";
File textFile = new File("/path/to/file.txt");
File binaryFile = new File("/path/to/file.bin");
String boundary = Long.toHexString(System.currentTimeMillis()); // Just generate some unique random value.
String CRLF = "\r\n"; // Line separator required by multipart/form-data.
URLConnection connection = new URL(url).openConnection();
connection.setDoOutput(true);
connection.setRequestProperty("Content-Type", "multipart/form-data; boundary=" + boundary);

try (
    OutputStream output = connection.getOutputStream();
    PrintWriter writer = new PrintWriter(new OutputStreamWriter(output, charset), true);
) {
    // Send normal param.
    writer.append("--" + boundary).append(CRLF);
    writer.append("Content-Disposition: form-data; name=\"param\"").append(CRLF);
    writer.append("Content-Type: text/plain; charset=" + charset).append(CRLF);
    writer.append(CRLF).append(param).append(CRLF).flush();

    // Send text file.
    writer.append("--" + boundary).append(CRLF);
    writer.append("Content-Disposition: form-data; name=\"textFile\"; filename=\"" + textFile.getName() + "\"").append(CRLF);
    writer.append("Content-Type: text/plain; charset=" + charset).append(CRLF); // Text file itself must be saved in this charset!
    writer.append(CRLF).flush();
    Files.copy(textFile.toPath(), output);
    output.flush(); // Important before continuing with writer!
    writer.append(CRLF).flush(); // CRLF is important! It indicates end of boundary.

    // Send binary file.
    writer.append("--" + boundary).append(CRLF);
    writer.append("Content-Disposition: form-data; name=\"binaryFile\"; filename=\"" + binaryFile.getName() + "\"").append(CRLF);
    writer.append("Content-Type: " + URLConnection.guessContentTypeFromName(binaryFile.getName())).append(CRLF);
    writer.append("Content-Transfer-Encoding: binary").append(CRLF);
    writer.append(CRLF).flush();
    Files.copy(binaryFile.toPath(), output);
    output.flush(); // Important before continuing with writer!
    writer.append(CRLF).flush(); // CRLF is important! It indicates end of boundary.

    // End of multipart/form-data.
    writer.append("--" + boundary + "--").append(CRLF).flush();
}
```

If the server side is again an [HttpServlet](http://docs.oracle.com/javaee/6/api/javax/servlet/http/HttpServlet.html), its doPost() method handles this request, and the server reads what you sent through [HttpServletRequest#getPart()](http://docs.oracle.com/javaee/6/api/javax/servlet/http/HttpServletRequest.html#getPart%28java.lang.String%29) (note: not getParameter()). getPart() is relatively new; it was introduced in Servlet 3.0. On versions before Servlet 3.0, you can use [Apache Commons FileUpload](http://commons.apache.org/fileupload) to parse the multipart/form-data request. See this [example](http://stackoverflow.com/questions/2422468/upload-big-file-to-servlet/2424824#2424824).

### Closing words
All of the above is rather verbose. Apache provides a library that makes these tasks more convenient:
[Apache HttpComponents HttpClient](http://hc.apache.org/httpcomponents-client-ga/)
- [HttpClient Tutorial](http://hc.apache.org/httpcomponents-client-ga/tutorial/html/)
- [HttpClient Examples](http://hc.apache.org/httpcomponents-client-ga/examples.html)


Google offers a similar [library](https://code.google.com/p/google-http-java-client/).

### Parsing and extracting HTML
If all you want is to parse and extract data from HTML, you can use an HTML parser such as [Jsoup](http://jsoup.org/).
- [Pros and cons of the leading Java HTML parsers](http://stackoverflow.com/questions/3152138/what-are-the-pros-and-cons-of-the-leading-java-html-parsers/3154281#3154281)
- [How to scan and parse a web page in Java](http://stackoverflow.com/questions/2835505/how-to-scan-a-website-or-page-for-info-and-bring-it-into-my-program/2835555#2835555)


Original Stack Overflow question:
http://stackoverflow.com/questions/2793150/using-java-net-urlconnection-to-fire-and-handle-http-requests
