Ⅰ Reading the source code of a URL page in a Java program
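Pass in a URL and get the page source back:
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.net.MalformedURLException;
import java.net.URL;
import java.net.URLConnection;

// Fetches the page at the given URL and returns its content as a string;
// to keep a copy, simply write the returned string to a file.
public static String getHTML(String url) {
    try {
        URL newUrl = new URL(url);
        URLConnection connect = newUrl.openConnection();
        connect.setRequestProperty("User-Agent", "Mozilla/4.0 (compatible; MSIE 5.0; Windows NT; DigExt)");
        // try-with-resources closes the reader even if reading fails
        try (BufferedReader in = new BufferedReader(new InputStreamReader(connect.getInputStream(), "UTF-8"))) {
            StringBuilder html = new StringBuilder();
            String readLine;
            while ((readLine = in.readLine()) != null) {
                // readLine() strips the line terminator, so put it back
                html.append(readLine).append('\n');
            }
            return html.toString();
        }
    } catch (MalformedURLException me) {
        System.out.println("MalformedURLException" + me);
    } catch (IOException ioe) {
        System.out.println("ioeException" + ioe);
    }
    return null;
}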
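The answer mentions saving the returned string to a file but does not show that step. A minimal sketch of it could look like the following; the class name SaveHtml, the method save, and the temporary file are illustrative assumptions, and the hard-coded string merely stands in for the value returned by getHTML.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

class SaveHtml {
    /** Writes the page text to the given file as UTF-8, replacing any existing content. */
    static void save(String html, Path target) throws IOException {
        Files.write(target, html.getBytes(StandardCharsets.UTF_8));
    }

    public static void main(String[] args) throws IOException {
        // Stand-in for the string returned by getHTML(url) (assumption for the demo).
        String html = "<html><body>hello</body></html>";
        Path target = Files.createTempFile("page", ".html");
        save(html, target);
        System.out.println("Saved " + Files.size(target) + " bytes to " + target);
    }
}
```

Writing explicit UTF-8 bytes keeps the file consistent with the UTF-8 decoding used when the page was read.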
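Ⅱ A problem with the readLine function when reading web page source in Java
This hardly needs explaining: readLine() is a blocking call. Each time execution reaches it, the program continues only after a complete line has been read.
You cannot make readLine() non-blocking, and the stall is not caused by your code anyway; it comes from the remote server or the network. All we can do is set a timeout and report a read failure once it expires. You could also try Apache HttpClient.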
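As a rough illustration of the timeout idea, here is a sketch that uses only the standard URLConnection class. The class name TimeoutFetch, the method fetch, and the 5000 ms value are assumptions made for this example; setConnectTimeout and setReadTimeout are the standard JDK methods.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.net.SocketTimeoutException;
import java.net.URL;
import java.net.URLConnection;
import java.nio.charset.StandardCharsets;

class TimeoutFetch {
    /** Reads a page, giving up if connecting or any single read stalls longer than timeoutMillis. */
    static String fetch(String address, int timeoutMillis) throws IOException {
        URLConnection conn = new URL(address).openConnection();
        conn.setConnectTimeout(timeoutMillis); // limit for establishing the connection
        conn.setReadTimeout(timeoutMillis);    // limit for each blocking read, including readLine()
        StringBuilder page = new StringBuilder();
        try (BufferedReader in = new BufferedReader(
                new InputStreamReader(conn.getInputStream(), StandardCharsets.UTF_8))) {
            String line;
            while ((line = in.readLine()) != null) {
                page.append(line).append('\n');
            }
        }
        return page.toString();
    }

    public static void main(String[] args) {
        if (args.length == 0) {
            System.err.println("usage: java TimeoutFetch <url>");
            return;
        }
        try {
            System.out.println(fetch(args[0], 5000));
        } catch (SocketTimeoutException e) {
            System.err.println("Read failed: timed out");
        } catch (IOException e) {
            System.err.println("Read failed: " + e);
        }
    }
}
```

With a read timeout set, a stalled readLine() throws SocketTimeoutException instead of waiting indefinitely, so the caller can report the failure.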
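Ⅲ Looking for the source code of a Java instant-messaging program; it must be complete and run directly in Eclipse, preferably without errors; something simple is fine
Nobody works for free in a market economy. First, no one is going to hand you such a program. Second, even if someone did, they would not patiently walk you through configuring it, installing the database, and running it. Do you really think you can just show up here and get it for nothing? That may sound harsh, but it is the truth.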
Ⅳ Write a program that downloads the page source at a given URL and lists all hyperlinks in it
import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.net.URL;
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class TestReg {
    /**
     * The regular expression is compiled once and reused, so repeated calls
     * do not recompile it; this is more efficient for frequent calls.
     */
    // Alternative that matches whole anchor tags: "<[aA]\\s*(href=[^>]+)>(.*?)</[aA]>"
    public static String patternString1 = "(http://[^\">]+)\"";
    public static Pattern pattern1 = Pattern.compile(patternString1, Pattern.DOTALL);

    /**
     * @param args
     */
    public static void main(String[] args) {
        /** Test data */
        String ss = "http://music..com/song/602998?fm=altg5";
        List<String> urls = getWebCon(ss);
        for (String url : urls) {
            System.out.println(url);
        }
    }

    /** Returns every quoted http:// URL found in the given text. */
    public static List<String> parseUrl(String var) {
        List<String> found = new ArrayList<String>();
        Matcher matcher = pattern1.matcher(var);
        while (matcher.find()) {
            // group(1) is the URL without the closing quote
            found.add(matcher.group(1));
        }
        return found;
    }

    /** Downloads the page and collects the URLs found on each line. */
    public static List<String> getWebCon(String domain) {
        List<String> sb = new ArrayList<String>();
        try {
            URL url = new URL(domain);
            // Decode as UTF-8 while reading instead of re-encoding each line afterwards
            try (BufferedReader in = new BufferedReader(
                    new InputStreamReader(url.openStream(), "UTF-8"))) {
                String line;
                while ((line = in.readLine()) != null) {
                    sb.addAll(parseUrl(line));
                }
            }
        } catch (Exception e) {
            System.err.println(e);
        }
        return sb;
    }
}
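Ⅴ C#: why does str = sr.ReadLine(); fail to read the whole line?
The line is read in full; the error happens when it is split, which is why "123" is not picked up. Solutions:
1. Reduce the spacing in each line of the text to a single space (mind the encoding when reading), for example: 張三 123
2. Extract the fields with a regular expression.
Also, you should close the stream before the return statement; otherwise the close call is never executed.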
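For comparison, the regular-expression approach from point 2 can be sketched in Java (used here to match the other examples on this page; the question itself is about C#). The class name SplitLine and the sample line are assumptions made for the illustration.

```java
import java.util.Arrays;

class SplitLine {
    /** Splits a line into fields on any run of whitespace (spaces, tabs, mixed). */
    static String[] fields(String line) {
        return line.trim().split("\\s+");
    }

    public static void main(String[] args) {
        // Several spaces and a tab between the two fields
        String line = "張三   \t 123";
        System.out.println(Arrays.toString(fields(line))); // prints [張三, 123]
    }
}
```

Splitting on a run of whitespace instead of a single space means the file does not have to be normalised first.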
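Ⅵ Perl reports an error at run time; source code and error message below
First, change these two lines
open IN,"1CHR.txt";
open OUTemp1,'>',"OutPDB.txt";
to
open IN, "1CHR.txt" or die "Can't read file: $!";
open OUTemp1, ">OutPDB.txt" or die "Can't create OUT: $!";
and see whether any error is reported.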
Ⅶ java.lang.NullPointerException at java.util.Properties$LineReader.readLine(Unknown Source) at java.
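The .properties configuration file cannot be found at the expected location, so the input stream passed to Properties.load() is most likely null, which is what triggers the NullPointerException inside readLine.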
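One way to turn this into a clearer error is to check the stream before loading. The sketch below assumes the file is looked up on the classpath; the class name ConfigLoader and the file name app.properties are hypothetical.

```java
import java.io.FileNotFoundException;
import java.io.IOException;
import java.io.InputStream;
import java.util.Properties;

class ConfigLoader {
    /** Loads a properties file from the classpath, failing with a clear message if it is missing. */
    static Properties load(String resourceName) throws IOException {
        // getResourceAsStream returns null (it does not throw) when the resource is not found
        try (InputStream in = ConfigLoader.class.getClassLoader().getResourceAsStream(resourceName)) {
            if (in == null) {
                throw new FileNotFoundException("Not on the classpath: " + resourceName);
            }
            Properties props = new Properties();
            props.load(in);
            return props;
        }
    }

    public static void main(String[] args) {
        try {
            Properties props = load("app.properties"); // hypothetical file name
            System.out.println(props);
        } catch (IOException e) {
            System.err.println("Could not load configuration: " + e.getMessage());
        }
    }
}
```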
Ⅷ How to get the source code of a web page from its address in Java (urgent)
import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.net.HttpURLConnection;
import java.net.URL;

public class HttpTest {
    String urlString;

    public static void main(String[] args) throws Exception {
        // Pass the page address as the first command-line argument
        HttpTest client = new HttpTest(args[0]);
        client.run();
    }

    public HttpTest(String urlString) {
        this.urlString = urlString;
    }

    public void run() throws Exception {
        // Create a URL object
        URL url = new URL(urlString);
        // Open the connection
        HttpURLConnection urlConnection = (HttpURLConnection) url.openConnection();
        // Get the input stream, i.e. the content of the page;
        // try-with-resources closes the reader when done
        try (BufferedReader reader = new BufferedReader(
                new InputStreamReader(urlConnection.getInputStream()))) {
            String line;
            // Read the stream line by line and print it
            while ((line = reader.readLine()) != null) {
                System.out.println(line);
            }
        }
    }
}
Ⅸ Looking for source code of the naive Bayes algorithm
ICTCLAS Chinese word segmentation for Lucene.Net, interface code (Analyzer implementation):
using System;
using System.Collections.Generic;
using System.Text;
using System.IO;

using Lucene.Net.Analysis;
using Lucene.Net.Analysis.Standard;

namespace AspxOn.Search.FenLei
{
    /// <summary>
    /// ICTCLAS word-segmentation component, interface for Lucene.Net
    /// </summary>
    public class ICTCLASAnalyzer : Analyzer
    {
        // Stop words to filter out
        public static readonly System.String[] CHINESE_ENGLISH_STOP_WORDS = new string[428];
        public string NoisePath = Environment.CurrentDirectory + "\\data\\stopwords.txt";

        public ICTCLASAnalyzer()
        {
            // "using" closes the reader once the stop words are loaded
            using (StreamReader reader = new StreamReader(NoisePath, System.Text.Encoding.Default))
            {
                string noise = reader.ReadLine();
                int i = 0;

                // Stop at the first empty line, at end of file, or when the array is full
                while (!string.IsNullOrEmpty(noise) && i < CHINESE_ENGLISH_STOP_WORDS.Length)
                {
                    CHINESE_ENGLISH_STOP_WORDS[i] = noise;
                    noise = reader.ReadLine();
                    i++;
                }
            }
        }

        /// <summary>
        /// Constructs an ICTCLASTokenizer filtered by a StandardFilter,
        /// a LowerCaseFilter and a StopFilter.
        /// </summary>
        public override TokenStream TokenStream(System.String fieldName, System.IO.TextReader reader)
        {
            TokenStream result = new ICTCLASTokenizer(reader);
            result = new StandardFilter(result);
            result = new LowerCaseFilter(result);
            result = new StopFilter(result, CHINESE_ENGLISH_STOP_WORDS);
            return result;
        }
    }
}
ICTCLAS Chinese word segmentation for Lucene.Net, interface code (Tokenizer implementation):
using System;
using System.Collections.Generic;
using System.Text;

using Lucene.Net.Analysis;
using SharpICTCLAS;
using System.IO;

namespace AspxOn.Search.FenLei
{
    public class ICTCLASTokenizer : Tokenizer
    {
        int nKind = 1;
        List<WordResult[]> result;
        int startIndex = 0;
        int endIndex = 0;
        int i = 1;

        /// <summary>
        /// The sentence to be segmented
        /// </summary>
        private string sentence;

        /// <summary>Constructs a tokenizer for this Reader.</summary>
        public ICTCLASTokenizer(System.IO.TextReader reader)
        {
            this.input = reader;
            sentence = input.ReadToEnd();
            sentence = sentence.Replace("\r\n", "");
            string DictPath = Path.Combine(Environment.CurrentDirectory, "Data") + Path.DirectorySeparatorChar;
            //Console.WriteLine("Initializing the dictionary, please wait");
            WordSegment wordSegment = new WordSegment();
            wordSegment.InitWordSegment(DictPath);
            result = wordSegment.Segment(sentence, nKind);
        }

        /// <summary>
        /// Performs segmentation; returns the next token in the stream,
        /// or null when the stream is exhausted.
        /// </summary>
        public override Token Next()
        {
            if (i < result[0].Length - 1)
            {
                string word = result[0][i].sWord;
                endIndex = startIndex + word.Length - 1;
                Token token = new Token(word, startIndex, endIndex);
                startIndex = endIndex + 1;

                i++;
                return token;
            }
            return null;
        }
    }
}
Chinese word splitter code:
using System;
using System.Collections.Generic;
using System.Text;
using System.IO;

using Lucene.Net.Analysis;
using Lucene.Net.Analysis.Standard;
using Lucene.Net.Documents;

using Lucene.Net.Analysis.Cn;
using Lucene.Net.Analysis.KTDictSeg;

namespace AspxOn.Search.FenLei
{
    /// <summary>
    /// Chinese word splitter
    /// </summary>
    public class ChineseSpliter
    {
        public static string Split(string text, string splitToken)
        {
            StringBuilder sb = new StringBuilder();

            Analyzer an = new ICTCLASAnalyzer();

            //TokenStream ts = an.ReusableTokenStream("", new StringReader(text));

            TokenStream ts = an.TokenStream("", new StringReader(text));

            Lucene.Net.Analysis.Token token;
            while ((token = ts.Next()) != null)
            {
                sb.Append(splitToken + token.TermText());
            }

            // Drop the leading separator; an empty result is returned unchanged
            string joined = sb.ToString();
            return joined.Length > 0 ? joined.Substring(splitToken.Length) : joined;
        }
    }
}
Prior probability calculation code:
using System;
using System.Collections.Generic;
using System.Text;

namespace AspxOn.Search.FenLei
{
    /// <summary>
    /// Prior probability calculation
    /// </summary>
    public class PriorProbability
    {
        private static TrainingDataManager tdm = new TrainingDataManager();

        /// <summary>
        /// Calculates the prior probability
        /// </summary>
        /// <param name="c">The given classification</param>
        /// <returns>The prior probability of the given classification</returns>
        public static float CaculatePc(string c)
        {
            float ret = 0F;
            float Nc = tdm.(c);
            float N = tdm.GetTrainFileCount();
            ret = Nc / N;
            return ret;
        }
    }
}
Class-conditional probability calculation code:
using System;
using System.Collections.Generic;
using System.Text;

namespace AspxOn.Search.FenLei
{
    /// <summary>
    /// Conditional probability calculation
    /// </summary>
    public class ClassConditionalProbability
    {

        private static TrainingDataManager tdm = new TrainingDataManager();
        private static float M = 0F;

        /// <summary>
        /// Class-conditional probability
        /// </summary>
        /// <param name="x">The given keyword</param>
        /// <param name="c">The given classification</param>
        /// <returns>The probability of keyword x given classification c</returns>
        public static float CaculatePxc(string x, string c)
        {
            float ret = 0F;
            float Nxc = tdm.(c, x);
            float Nc = tdm.(c);
            float V = tdm.GetTrainingClassifications().Length;

            ret = (Nxc + 1) / (Nc + V + M); // Smoothed by adding one so that the result is never 0

            return ret;
        }
    }
}
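To make the two formulas concrete, here is a small Java sketch with made-up counts (Java is used to match the other examples on this page). The class and method names are illustrative only. Reading Nc as the number of training files in class c and Nxc as the number of those files containing keyword x is an assumption, since the corresponding method names are missing from the post.

```java
class NaiveBayesFormulas {
    /** Prior probability P(c) = Nc / N. */
    static double prior(int nc, int n) {
        return (double) nc / n;
    }

    /** Smoothed class-conditional probability P(x|c) = (Nxc + 1) / (Nc + V + M). */
    static double conditional(int nxc, int nc, int v, int m) {
        return (nxc + 1.0) / (nc + v + m);
    }

    public static void main(String[] args) {
        // Made-up counts: 10 training files, 4 in class c, 2 classes, M = 0 (assumed values)
        int n = 10, nc = 4, v = 2, m = 0;
        System.out.println("P(c)            = " + prior(nc, n));
        System.out.println("P(x|c), Nxc = 3 = " + conditional(3, nc, v, m));
        // A keyword never seen in class c still gets a non-zero probability
        System.out.println("P(x|c), Nxc = 0 = " + conditional(0, nc, v, m));
    }
}
```

The last line shows why the numerator adds 1: without it, a single unseen keyword would force the whole product of probabilities for that class to zero.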
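Ⅹ Looking for website hit-counter source code, the kind without embedded site links, truly free
Here is the counter code. You can also search online; there are other ways to do it.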
<%
CountFile = Server.MapPath("txtcounter.txt")
Set FileObject = Server.CreateObject("Scripting.FileSystemObject")
' Lock before reading so that concurrent requests cannot read the same value
Application.Lock
Set Out = FileObject.OpenTextFile(CountFile, 1, False, False)
counter = CLng(Out.ReadLine)
Out.Close
counter = counter + 1
Set Out = FileObject.CreateTextFile(CountFile, True, False)
Out.WriteLine(counter)
Out.Close
Application.Unlock
Response.Write "document.write(" & counter & ")"
%>
Then create a text file named txtcounter.txt in the same folder and put a number in it. Any number will do; counting starts from that value.
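The same read, increment, and write-back cycle can be sketched in Java for comparison. The class name FileCounter and the temporary file are assumptions made for the illustration, and the synchronized keyword only guards against concurrent calls within a single JVM.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

class FileCounter {
    /** Reads the number stored in the file, adds one, writes it back, and returns the new value. */
    static synchronized long increment(Path file) throws IOException {
        String text = new String(Files.readAllBytes(file), StandardCharsets.UTF_8).trim();
        long next = Long.parseLong(text) + 1;
        Files.write(file, Long.toString(next).getBytes(StandardCharsets.UTF_8));
        return next;
    }

    public static void main(String[] args) throws IOException {
        // The file must already contain a starting number, as with txtcounter.txt
        Path file = Files.createTempFile("txtcounter", ".txt");
        Files.write(file, "0".getBytes(StandardCharsets.UTF_8));
        System.out.println(increment(file)); // prints 1
        System.out.println(increment(file)); // prints 2
    }
}
```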