开发Web应用程序时,无论是用什么样的框架技术来开发,一碰从数据库存取涉及到中文的数据,就要面对中文乱码或者是各种编码方式不匹配的异常,今天晚上终于搞定了Tomcat+MySql+Struts的中文问题,用到了很简单的方法,很快就能搞定。
在做以下工作之前,所有的HTML/JSP的charset都设为charset=gb2312。
第一个要解决的是表单提交乱码问题。在使用Struts提供的ActionForm过程中,无论表单采用的是Struts标签还是Html标签,都可以用ActionForm的Get/Set来获取和设置表单的元素值(它们的作用效果与request.getParameter()方法一样),但提取出来的数据不经过处理的话就是乱码,主要的原因是1.Tomcat的J2EE实现对表单提交即Post方法提交时,处理参数采用默认的ISO8859_1来处理2.Tomcat对Get方法提交的请求在query-string处理时采用了和Post方法不一样的处理方式。所以如果要正确地显示和获取中文数据采用的解决方案:(1)对于Post方法提交的表单通过编写一个过滤器(filer)的方法解决,过滤器在用户提交的数据被处理之前被调用,可以通过这个Java代码改变参数的编码方式(目标编码方式可以通过Web.xml文件里面的参数指定)。过滤器的代码如下:
import java.io.IOException;
import javax.servlet.Filter;
import javax.servlet.FilterChain;
import javax.servlet.FilterConfig;
import javax.servlet.ServletException;
import javax.servlet.ServletRequest;
import javax.servlet.ServletResponse;
import javax.servlet.UnavailableException;
/**
* <p>Example filter that sets the character encoding to be used in parsing the
* incoming request, either unconditionally or only if the client did not
* specify a character encoding. Configuration of this filter is based on
* the following initialization parameters:</p>
* <ul>
* <li><strong>encoding</strong> - The character encoding to be configured
* for this request, either conditionally or unconditionally based on
* the <code>ignore</code> initialization parameter. This parameter
* is required, so there is no default.</li>
* <li><strong>ignore</strong> - If set to "true", any character encoding
* specified by the client is ignored, and the value returned by the
* <code>selectEncoding()</code> method is set. If set to "false,
* <code>selectEncoding()</code> is called <strong>only</strong> if the
* client has not already specified an encoding. By default, this
* parameter is set to "true".</li>
* </ul>
*
* <p>Although this filter can be used unchanged, it is also easy to
* subclass it and make the <code>selectEncoding()</code> method more
* intelligent about what encoding to choose, based on characteristics of
* the incoming request (such as the values of the <code>Accept-Language</code>
* and <code>User-Agent</code> headers, or a value stashed in the current
* user's session.</p>
*
* @author Craig McClanahan
* @version $Revision: 1.2 $ $Date: 2004/03/18 16:40:33 $
*/
public class SetCharacterEncodingFilter implements Filter {
// ----------------------------------------------------- Instance Variables
/**
* The default character encoding to set for requests that pass through
* this filter.
*/
protected String encoding = null;
/**
* The filter configuration object we are associated with. If this value
* is null, this filter instance is not currently configured.
*/
protected FilterConfig filterConfig = null;
/**
* Should a character encoding specified by the client be ignored?
*/
protected boolean ignore = true;
// --------------------------------------------------------- Public Methods
/**
* Take this filter out of service.
*/
public void destroy() {
this.encoding = null;
this.filterConfig = null;
}
/**
* Select and set (if specified) the character encoding to be used to
* interpret request parameters for this request.
*
* @param request The servlet request we are processing
* @param result The servlet response we are creating
* @param chain The filter chain we are processing
*
* @exception IOException if an input/output error occurs
* @exception ServletException if a servlet error occurs
*/
public void doFilter(ServletRequest request, ServletResponse response,
FilterChain chain)
throws IOException, ServletException {
// Conditionally select and set the character encoding to be used
if (ignore || (request.getCharacterEncoding() == null)) {
String encoding = selectEncoding(request);
if (encoding != null)
request.setCharacterEncoding(encoding);
}
// Pass control on to the next filter
chain.doFilter(request, response);
}
/**
* Place this filter into service.
*
* @param filterConfig The filter configuration object
*/
public void init(FilterConfig filterConfig) throws ServletException {
this.filterConfig = filterConfig;
this.encoding = filterConfig.getInitParameter("encoding");
String value = filterConfig.getInitParameter("ignore");
if (value == null)
this.ignore = true;
else if (value.equalsIgnoreCase("true"))
this.ignore = true;
else if (value.equalsIgnoreCase("yes"))
this.ignore = true;
else
this.ignore = false;
}
// ------------------------------------------------------ Protected Methods
/**
* Select an appropriate character encoding to be used, based on the
* characteristics of the current request and/or filter initialization
* parameters. If no character encoding should be set, return
* <code>null</code>.
* <p>
* The default implementation unconditionally returns the value configured
* by the <strong>encoding</strong> initialization parameter for this
* filter.
*
* @param request The servlet request we are processing
*/
protected String selectEncoding(ServletRequest request) {
return (this.encoding);
}
}
编绎后把class文件放在classes目录下,并在Web应用的web.xml文件中添加如下代码:
<filter>
<filter-name>Set Character Encoding</filter-name>
<filter-class>com.neusoft.equipment.controller.SetCharacterEncodingFilter</filter-class>
<init-param>
<param-name>encoding</param-name>
<param-value>gbk</param-value>
</init-param>
</filter>
<filter-mapping>
<filter-name>Set Character Encoding</filter-name>
<url-pattern>/*</url-pattern>
</filter-mapping>只要是gb2312,gbk,utf8等支持多字节编码的字符集都可以储存汉字,当然,gb2312中的汉字数量远少于gbk,而gb2312,gbk等都可在utf8下编码,这里指定目标编码方式是gbk,重新启动Tomcat后就可以了。
(2)对Get方法提交的表单,由于参数是紧跟在用户的URL请求后面,Tomcat对其的处理方法与Post方法不一样。所以上面设置的过滤器对Get方法没有作用,它需要在其他地方设置。找到Tomca的server.xml配置文件,找到对80(或者是8080等别的,这个是自己修改后的)的Connector组件的设置部分,给这个Connector组件添加一个属性:URIEncoding="GBK"。修改后的Connector组件是这样的:
<!-- Define a non-SSL HTTP/1.1 Connector on port 80-->
<Connector port="80" maxHttpHeaderSize="8192"
maxThreads="150" minSpareThreads="25" maxSpareThreads="75"
enableLookups="false" redirectPort="8443" acceptCount="100"
connectionTimeout="20000" disableUploadTimeout="true" URIEncoding="GBK"/>这样修改后,重启Tomcat就可以正确处理GET方法提交的表单数据了。
第二个要解决的是数据库存取数据出现的乱码等情况。对于不同的数据库往往支持不同的编码,造成了应用时比较混乱,不同的数据库的解决方法往往是不同的,针对MySql,网上也有各种各样的解决方案,但个人觉得那些太繁了,现在有一个极其简单的解决办法:修改MySql的配置文件,打开MySql安装后的根目录,找到my.init文件,把[mysqld]区的如下语句:default-character-set=latin1修改为:default-character-set=gbk,然后在[client]区增加:default-character-set=gbk,修改后记得做一件事情,到Widows控制面板的管理工具下的服务程序,把Mysql服务停止了重新启动,这样就根本解决了MySql的数据库乱码问题,很简单~~~~