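The requirement: fetch query results from a website. The environment: VB.NET.
Starting from zero, I didn't even know which parts of .NET I would need, so I searched more or less at random on Google, Baidu, Yisou and so on. My search terms included ".net ie", "拆分网页 .net" (split web page .net), "embedded IE" and the like. Before long I learned about the WebBrowser control.
The article that helped me most was http://www.microsoft.com/china/msdn/Archives/workshop/scrape.asp
It only covers classic VB, but moving to .NET isn't much different. Don't laugh! At first it took me ages just to find shdocvw.dll and mshtml.dll when adding the references, because everyone calls the control "WebBrowser" while .NET lists it as "Microsoft Web 浏览器" (Microsoft Web Browser).
Work through the article above first for practice!
Enough preamble.
First create a text box and a button, for entering the query and submitting it.
In the button's Click event, write: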
Dim postdata As String() = {"searchText=" & Me.searchText.Text}
Dim strUrl As String = "http://"
Dim SessionHtml As String = PostDate(strUrl, postdata)
' Write the returned HTML to a temporary file
Dim sw As StreamWriter = New StreamWriter("D:\1.htm", False, Encoding.GetEncoding("GB2312"))
sw.WriteLine(SessionHtml)
sw.Close()
' Load the temporary file into the WebBrowser control
Me.AxWebBrowserFill.Navigate("D:\1.htm")
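The PostDate function is as follows (both it and the Click handler need Imports System.IO, Imports System.Net and Imports System.Text at the top of the file):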
Public Function PostDate(ByVal url As String, ByVal PostData() As String) As String
    ' Join the "name=value" pairs into a single form body
    Dim Post As String = String.Join("&", PostData)
    Dim html As String = ""
    Dim gb2312 As Encoding = Encoding.GetEncoding("GB2312")
    Dim data As Byte() = gb2312.GetBytes(Post)
    Dim myRequest As HttpWebRequest = CType(WebRequest.Create(url), HttpWebRequest)
    myRequest.Method = "POST"
    myRequest.ContentType = "application/x-www-form-urlencoded"
    'myRequest.ContentType = "text/asp"
    myRequest.ContentLength = data.Length
    ' Send the form body
    Dim newStream As Stream = myRequest.GetRequestStream()
    newStream.Write(data, 0, data.Length)
    newStream.Close()
    ' Read the response, decoding it as GB2312
    Dim resp As HttpWebResponse = CType(myRequest.GetResponse(), HttpWebResponse)
    Dim sr As StreamReader = New StreamReader(resp.GetResponseStream(), gb2312)
    ' Return the HTML source as a string
    html = sr.ReadToEnd()
    sr.Close()
    resp.Close()
    Return html
End Function
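That's all it takes.
As for displaying the HTML directly in the WebBrowser control without going through a temporary file, everything I found online was a Delphi approach; .NET doesn't seem to have a clean solution.
I tried the following: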
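One thing the function above does not do is escape the form values, so a query containing a space, "&" or Chinese characters would be sent raw. A minimal sketch of one way to handle this, assuming the .NET Framework with a reference to System.Web.dll and assuming the target site expects GB2312 percent-encoding; EncodePair is a hypothetical helper name, not part of the original code:

```vb
Imports System
Imports System.Text
Imports System.Web   ' needs a project reference to System.Web.dll

Module FormEncoding
    ' Hypothetical helper: builds one "name=value" pair and
    ' percent-encodes the value using its GB2312 byte representation.
    Public Function EncodePair(ByVal name As String, ByVal value As String) As String
        Dim gb2312 As Encoding = Encoding.GetEncoding("GB2312")
        Return name & "=" & HttpUtility.UrlEncode(value, gb2312)
    End Function

    Sub Main()
        ' The space and the "&" in the value are escaped, so they can no
        ' longer be mistaken for separators in the form body.
        Console.WriteLine(EncodePair("searchText", "a b&c"))
    End Sub
End Module
```

In the Click handler, the array element could then be built with EncodePair("searchText", Me.searchText.Text) instead of plain concatenation.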
'AxWebBrowserFill.Navigate(SessionHtml)
'Me.AxWebBrowserFill.Document.write(SessionHtml + "haga")
'Me.axScriptLet.url = "about:blank" + SessionHtml
'Me.AxWebBrowserFill.Document.write(SessionHtml)
'doc = Me.AxWebBrowserFill.Document
'doc.body.innerHTML = SessionHtml
'doc.write(SessionHtml)
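These usually worked only the first time, and the double quotes inside the HTML caused problems along the way.
Some posts online also suggest the following:
''Display the report content field in the WebBrowser
'Dim doc As IHTMLDocument2 = CType(AxWebBrowserFill.Document, IHTMLDocument2)
'Dim bodyElement As IHTMLElement = CType(doc.body, IHTMLElement)
''bodyElement.innerHTML = SessionHtml + "haga"
But this approach never worked for me!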
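On .NET Framework 2.0 and later there is another route that may be worth trying: the managed System.Windows.Forms.WebBrowser control exposes a DocumentText property that accepts an HTML string directly. The sketch below assumes that managed control rather than the AxWebBrowser COM wrapper used above; PreviewForm and the sample HTML are made-up names for illustration, and I have not checked it against the pages in this post.

```vb
Imports System
Imports System.Windows.Forms

' Hypothetical form that shows an HTML string without a temporary file.
Public Class PreviewForm
    Inherits Form

    Private browser As New WebBrowser()

    Public Sub New(ByVal html As String)
        browser.Dock = DockStyle.Fill
        Me.Controls.Add(browser)
        ' DocumentText loads the string as the page content; the control
        ' renders it asynchronously once the form is shown.
        browser.DocumentText = html
    End Sub
End Class

Module Program
    <STAThread()> _
    Sub Main()
        ' Placeholder markup; in the scraper this would be SessionHtml.
        Application.Run(New PreviewForm("<html><body><h1>hello</h1></body></html>"))
    End Sub
End Module
```

Because the string is assigned as a whole rather than passed through a URL or script, the double quotes inside the HTML need no special treatment here.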