pythonmilliseconds

Published: 2024-12-15 11:48:50

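① Sphinx on Windows: installation and use [with Chinese full-text search]

A while ago I tried out Sphinx, a full-text search system that is easy to call from many languages (PHP, Python, Ruby, etc.). Most material online covers installation on Linux. Production deployments certainly belong on *nix, but for learning and testing, Windows is more convenient.

This article describes a convenient way to install and configure Sphinx on Windows with Chinese full-text search support. The configuration part applies to Linux as well.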

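I. About Sphinx

Sphinx is a full-text search engine released under GPLv2. Commercial use (for example, embedding it in another program) requires contacting the author (Sphinxsearch.com) for a commercial license.

Generally speaking, Sphinx is a standalone search engine intended to give other applications fast, space-efficient, and highly relevant full-text search. It integrates easily with SQL databases and scripting languages.

It currently has built-in data source support for MySQL and PostgreSQL, and can also read XML data in a specific format from standard input. By modifying the source code, users can add new data sources (for example, native support for other kinds of DBMS).

The search API supports PHP, Python, Perl, Ruby, and Java, and Sphinx can also be used as a MySQL storage engine. The search API is very simple and can be ported to a new language within a few hours.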

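Sphinx features:

High-speed indexing (peak performance of up to 10 MB/s on modern CPUs);
High-performance search (average response time under 0.1 s per query on 2 to 4 GB of text);
Handles large data volumes (known to handle more than 100 GB of text, and 100 M documents on a single-CPU system);
Good relevance ranking, combining phrase proximity with statistical (BM25) ranking;
Distributed search;
Document excerpt generation;
Can serve searches as a MySQL storage engine;
Multiple query modes, including boolean, phrase, and word proximity;
Multiple full-text fields per document (at most 32);
Multiple additional attributes per document (for example, group information or timestamps);
Stop-word support;
Single-byte encodings and UTF-8;
Native MySQL support (both MyISAM and InnoDB);
Native PostgreSQL support.

A Chinese manual is available for download (mirror on Kuqin: sphinx_doc_zhcn_0.9.pdf).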

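II. Installing Sphinx on Windows

1. Find the latest Windows build at http://www.sphinxsearch.com/downloads.html. I downloaded "Win32 release binaries with MySQL support" and extracted it to D:\sphinx.

2. Under D:\sphinx, create a data directory for index files and a log directory for log files, then copy D:\sphinx\sphinx.conf.in to D:\sphinx\bin\sphinx.conf (note the changed file name).

3. Edit D:\sphinx\bin\sphinx.conf. These are the settings that need changing: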

type = mysql # data source type; MySQL here
sql_host = localhost # database server
sql_user = root # database user name
sql_pass = '' # database password
sql_db = test # database name
sql_port = 3306 # database port

sql_query_pre = SET NAMES utf8 # uncomment this line if your database uses UTF-8

index test1
{
    # directory where the index is stored
    path = D:/sphinx/data/
    # encoding
    charset_type = utf-8
    # character table for UTF-8
    charset_table = 0..9, A..Z->a..z, _, a..z, U+410..U+42F->U+430..U+44F, U+430..U+44F
    # simple n-gram tokenization; only 0 and 1 are supported; set to 1 to search Chinese
    ngram_len = 1
    # characters to tokenize; uncomment this to search Chinese
    ngram_chars = U+3000..U+2FA1F
}

# index test1stemmed : test1
# {
    # path = @CONFDIR@/data/test1stemmed
    # morphology = stem_en
# }

# If you have no distributed index, comment out the following
# index dist1
# {
    # 'distributed' index type MUST be specified
    # type = distributed

    # local index to be searched
    # there can be many local indexes configured
    # local = test1
    # local = test1stemmed

    # remote agent
    # multiple remote agents may be specified
    # syntax is 'hostname:port:index1,[index2[,...]]
    # agent = localhost:3313:remote1
    # agent = localhost:3314:remote2,remote3

    # remote agent connection timeout, milliseconds
    # optional, default is 1000 ms, ie. 1 sec
    # agent_connect_timeout = 1000

    # remote agent query timeout, milliseconds
    # optional, default is 3000 ms, ie. 3 sec
    # agent_query_timeout = 3000
# }

# Parts of the search service settings that need changing
searchd
{
    # log file
    log = D:/sphinx/log/searchd.log

    # PID file, searchd process ID file name
    pid_file = D:/sphinx/log/searchd.pid

    # this must be commented out to start the searchd service on Windows
    # seamless_rotate = 1
}
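
To give a feel for what `ngram_len = 1` together with `ngram_chars = U+3000..U+2FA1F` asks for, here is a minimal Python sketch. It is not Sphinx's actual tokenizer: each character inside the configured range becomes its own token, while runs of other letters, digits, and underscores stay together as lowercase words. The function name `unigram_tokens` and the simplified word rule are assumptions made for illustration.

```python
NGRAM_FIRST = 0x3000   # lower bound of ngram_chars in the config above
NGRAM_LAST = 0x2FA1F   # upper bound of ngram_chars in the config above


def unigram_tokens(text):
    """Split text so each character in the n-gram range is a separate token.

    Hypothetical helper: an approximation of 1-gram indexing, not Sphinx code.
    """
    tokens = []
    word = []
    for ch in text:
        if NGRAM_FIRST <= ord(ch) <= NGRAM_LAST:
            # flush any pending word, then emit the character by itself
            if word:
                tokens.append("".join(word))
                word = []
            tokens.append(ch)
        elif ch.isalnum() or ch == "_":
            word.append(ch.lower())
        else:
            # separators end the current word
            if word:
                tokens.append("".join(word))
                word = []
    if word:
        tokens.append("".join(word))
    return tokens


if __name__ == "__main__":
    print(unigram_tokens("測試中文 test Two"))
```

Because every Chinese character is indexed separately, a query for two adjacent characters can match without any dictionary-based word segmentation.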

4. Import the test data

C:\Program Files\MySQL\MySQL Server 5.0\bin>mysql -uroot test<d:/sphinx/example.sql

5. Build the index

D:\sphinx\bin>indexer.exe test1
Sphinx 0.9.8-release (r1533)
Copyright (c) 2001-2008, Andrew Aksyonoff
using config file './sphinx.conf'...
indexing index 'test1'...
collected 4 docs, 0.0 MB
sorted 0.0 Mhits, 100.0% done
total 4 docs, 193 bytes
total 0.101 sec, 1916.30 bytes/sec, 39.72 docs/sec
D:\sphinx\bin>

6. Try searching for 'test'

D:\sphinx\bin>search.exe test
Sphinx 0.9.8-release (r1533)
Copyright (c) 2001-2008, Andrew Aksyonoff
using config file './sphinx.conf'...
index 'test1': query 'test ': returned 3 matches of 3 total in 0.000 sec
displaying matches:
1. document=1, weight=2, group_id=1, date_added=Wed Nov 26 14:58:59 2008
        id=1
        group_id=1
        group_id2=5
        date_added=2008-11-26 14:58:59
        title=test one
        content=this is my test document number one. also checking search within phrases.
2. document=2, weight=2, group_id=1, date_added=Wed Nov 26 14:58:59 2008
        id=2
        group_id=1
        group_id2=6
        date_added=2008-11-26 14:58:59
        title=test two
        content=this is my test document number two
3. document=4, weight=1, group_id=2, date_added=Wed Nov 26 14:58:59 2008
        id=4
        group_id=2
        group_id2=8
        date_added=2008-11-26 14:58:59
        title=doc number four
        content=this is to test groups
words:
1. 'test': 3 documents, 5 hits
D:\sphinx\bin>

Everything was found, right?

7. Test Chinese search

Modify the documents table in the test database:

UPDATE `test`.`documents` SET `title` = '測試中文', `content` = 'this is my test document number two,應該搜的到吧' WHERE `documents`.`id` = 2;

Rebuild the index:

D:\sphinx\bin>indexer.exe --all

Try searching for '中文':

D:\sphinx\bin>search.exe 中文

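Sphinx 0.9.8-release (r1533)
Copyright (c) 2001-2008, Andrew Aksyonoff
using config file './sphinx.conf'...
index 'test1': query '中文 ': returned 0 matches of 0 total in 0.000 sec
words:
D:\sphinx\bin>

It seems nothing was found. That is because the Windows command line uses GBK encoding, so the query does not match the UTF-8 data in the index. We can try from a program instead: create a file named foo.php under D:\sphinx\api, and make sure it is saved as UTF-8.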

<?php
require 'sphinxapi.php';
$s = new SphinxClient();
$s->SetServer('localhost', 3312);
$result = $s->Query('中文');
var_dump($result);
?>

Start the Sphinx searchd service:

D:\sphinx\bin>searchd.exe
Sphinx 0.9.8-release (r1533)
Copyright (c) 2001-2008, Andrew Aksyonoff
WARNING: forcing --console mode on Windows
using config file './sphinx.conf'...
creating server socket on 0.0.0.0:3312
accepting connections

Run the PHP query:

php d:/sphinx/api/foo.php

Did the results come back? What remains is to read the manual and gradually work out the more advanced configuration.
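
The encoding mismatch behind the empty command-line result in step 7 can be seen in a few lines of Python. This is only an illustration, assuming the console hands the query over as GBK bytes while the index was built from UTF-8 data: the same two characters have entirely different byte sequences in the two encodings.

```python
query = "中文"

gbk_bytes = query.encode("gbk")      # bytes a GBK console would produce
utf8_bytes = query.encode("utf-8")   # bytes stored in a UTF-8 database

print(gbk_bytes.hex())
print(utf8_bytes.hex())
print(gbk_bytes == utf8_bytes)
```

The GBK form takes two bytes per character and the UTF-8 form three, so a byte-level comparison between them cannot succeed.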

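② How to add and subtract times in Python

Use timedelta, which supports the arithmetic directly.
datetime.timedelta(days=0, seconds=0, microseconds=0, milliseconds=0, minutes=0, hours=0, weeks=0)
timedelta accepts days, hours, minutes, seconds, weeks, milliseconds, and microseconds.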
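
A short sketch of what that looks like in practice. The start time and the offsets are arbitrary values chosen for the example:

```python
from datetime import datetime, timedelta

start = datetime(2024, 12, 15, 11, 48, 50)

# Adding: 1 day, 2 hours and 500 milliseconds later
later = start + timedelta(days=1, hours=2, milliseconds=500)

# Subtracting: 90 minutes earlier
earlier = start - timedelta(minutes=90)

# The difference between two datetimes is itself a timedelta;
# dividing by a 1 ms timedelta expresses it in milliseconds
elapsed = later - start
elapsed_ms = elapsed / timedelta(milliseconds=1)

print(later)
print(earlier)
print(elapsed_ms)
```

Dividing one timedelta by another yields a float, which is a convenient way to convert a duration into any unit.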

③ Looking for Python code that joins two MP3 audio files into one MP3 file

You can use pydub.

1 URL: https://github.com/jiaaro/pydub

2 pydub depends on libav or ffmpeg

3 Install the dependency on macOS (choose one of the two):

brew install libav --with-libvorbis --with-sdl --with-theora

Or install ffmpeg with all of its optional dependencies:

brew install ffmpeg --with-fdk-aac --with-ffplay --with-freetype --with-frei0r --with-libass --with-libvo-aacenc --with-libvorbis --with-libvpx --with-opencore-amr --with-openjpeg --with-opus --with-rtmpdump --with-schroedinger --with-speex --with-theora --with-tools --with-fdk-aac --with-freetype --with-ffplay --with-ffplay --with-freetype --with-frei0r --with-libass --with-libbluray --with-libcaca --with-libquvi --with-libvidstab --with-libvo-aacenc --with-libvorbis --with-libvpx --with-opencore-amr --with-openjpeg --with-openssl --with-opus --with-rtmpdump --with-schroedinger --with-speex --with-theora --with-tools --with-x265

4 Install pydub: pip install pydub

5 Using pydub:

Sample code follows:

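from pydub import AudioSegment

# enDir, cnDir, toDir, file and enfile come from the rest of the original script
enPath = "%s%s/%s" % (enDir, file, enfile)  # path of the English file
cnPath = "%s%s/%s" % (cnDir, file, enfile.replace("en_w", "cn_w"))  # path of the Chinese file
targetPath = "%s%s/%s" % (toDir, file, enfile.replace("en_w", "all"))  # path of the merged file

# load the MP3 files
song1 = AudioSegment.from_mp3(enPath)
song2 = AudioSegment.from_mp3(cnPath)

# get the loudness (dBFS) of the two MP3 files
db1 = song1.dBFS
db2 = song2.dBFS

song1 = song1[300:]  # keep the English MP3 from 300 ms onward

# even out the volume so that one clip is not louder than the other
dbplus = db1 - db2
if dbplus < 0:  # song1 is quieter
    song1 += abs(dbplus)
elif dbplus > 0:  # song2 is quieter
    song2 += abs(dbplus)

# join the two audio segments
song = song1 + song2

# export the audio file
song.export(targetPath, format="mp3")  # export as MP3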
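
The volume-matching step can be pulled out into a small function that is testable without any audio files. This is a sketch that assumes only what the code above relies on: loudness is given in dBFS, and the quieter clip should be raised by the difference. The name `gain_adjustments` is made up for illustration.

```python
def gain_adjustments(db1, db2):
    """Return (gain1, gain2) in dB that raise the quieter clip to the louder one."""
    diff = db1 - db2
    if diff < 0:        # first clip is quieter
        return abs(diff), 0.0
    if diff > 0:        # second clip is quieter
        return 0.0, abs(diff)
    return 0.0, 0.0     # already equally loud


if __name__ == "__main__":
    gain1, gain2 = gain_adjustments(-20.0, -14.5)
    print(gain1, gain2)
```

With pydub, the two results would be applied as `song1 += gain1` and `song2 += gain2` before concatenating.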

④ Does SQLite in Python support remote database access?

SQLite handles concurrent access with its own file locking.
Multiple processes can have the same database open at the same time. Multiple processes can be doing a SELECT at the same time. But only one process can be making changes to the database at any moment in time, however.
SQLite uses reader/writer locks to control access to the database. (Under Win95/98/ME which lacks support for reader/writer locks, a probabilistic simulation is used instead.) But use caution: this locking mechanism might not work correctly if the database file is kept on an NFS filesystem. This is because fcntl() file locking is broken on many NFS implementations. You should avoid putting SQLite database files on NFS if multiple processes might try to access the file at the same time. On Windows, Microsoft's documentation says that locking may not work under FAT filesystems if you are not running the Share.exe daemon. People who have a lot of experience with Windows tell me that file locking of network files is very buggy and is not dependable. If what they say is true, sharing an SQLite database between two or more Windows machines might cause unexpected problems.
We are aware of no other embedded SQL database engine that supports as much concurrency as SQLite. SQLite allows multiple processes to have the database file open at once, and for multiple processes to read the database at once. When any process wants to write, it must lock the entire database file for the duration of its update. But that normally only takes a few milliseconds. Other processes just wait on the writer to finish then continue about their business. Other embedded SQL database engines typically only allow a single process to connect to the database at once.
However, client/server database engines (such as PostgreSQL, MySQL, or Oracle) usually support a higher level of concurrency and allow multiple processes to be writing to the same database at the same time. This is possible in a client/server database because there is always a single well-controlled server process available to coordinate access. If your application has a need for a lot of concurrency, then you should consider using a client/server database. But experience suggests that most applications need much less concurrency than their designers imagine.
When SQLite tries to access a file that is locked by another process, the default behavior is to return SQLITE_BUSY. You can adjust this behavior from C code using the sqlite3_busy_handler() or sqlite3_busy_timeout() API functions.
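
In Python, the `sqlite3` module exposes the busy timeout through the `timeout` argument of `sqlite3.connect`. The sketch below uses a temporary database file and two connections in one process to stand in for two processes. The function name, the table, and the 0.1-second timeout are arbitrary choices for the demonstration.

```python
import os
import sqlite3
import tempfile


def second_writer_is_blocked(db_path):
    """Hold a write lock on one connection and try to write from another."""
    # isolation_level=None lets us issue BEGIN/COMMIT ourselves
    writer = sqlite3.connect(db_path, isolation_level=None)
    other = sqlite3.connect(db_path, timeout=0.1, isolation_level=None)
    try:
        writer.execute("CREATE TABLE IF NOT EXISTS t (x INTEGER)")
        writer.execute("BEGIN IMMEDIATE")          # take the write lock
        writer.execute("INSERT INTO t VALUES (1)")
        try:
            other.execute("BEGIN IMMEDIATE")       # waits up to 0.1 s, then fails
        except sqlite3.OperationalError as exc:    # SQLITE_BUSY surfaces here
            blocked = "locked" in str(exc)
        else:
            other.execute("ROLLBACK")
            blocked = False
        writer.execute("COMMIT")                   # release the lock
        # once the writer has finished, the other connection can read the row
        rows = other.execute("SELECT x FROM t").fetchall()
        return blocked, rows
    finally:
        writer.close()
        other.close()


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        print(second_writer_is_blocked(os.path.join(tmp, "demo.db")))
```

A larger `timeout` makes the second connection wait longer for the writer instead of failing quickly, which is usually what an application with short write transactions wants.
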
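SQLite is just a local file. Its API ships in each language's development package, and SQLite itself has no client/server networking capability.
From the official documentation:
"If you have many client programs accessing a common database over a network, you should consider using a client/server database engine instead of SQLite."
If you really must support remote access, there are a few ways out:
1. Switch to a database that supports network access, such as MySQL.
If you insist on using SQLite:
2. As described above, use a network file system, though this is not recommended: random reads and writes perform poorly on NFS and similar systems, and stability is a concern.
3. Wrap it with RPC, for example Thrift or XML-RPC. In Java, RMI and the like can be used directly.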
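
Option 3 can be sketched with Python's standard `xmlrpc` modules. This is a minimal illustration, not a production design: the `run_query` function, the loopback address, and the demo table are all assumptions, and a real service would need authentication and would expose specific operations rather than raw SQL.

```python
import os
import sqlite3
import tempfile
import threading
from xmlrpc.client import ServerProxy
from xmlrpc.server import SimpleXMLRPCServer


def make_server(db_path):
    """Create an XML-RPC server that runs queries against the SQLite file."""
    def run_query(sql, params):
        # open a connection per call so it is used in the thread that made it
        conn = sqlite3.connect(db_path)
        try:
            return [list(row) for row in conn.execute(sql, params)]
        finally:
            conn.close()

    # port 0 asks the operating system for any free port
    server = SimpleXMLRPCServer(("127.0.0.1", 0), logRequests=False)
    server.register_function(run_query, "run_query")
    return server


def demo():
    """Start the server in a thread and query it as a remote client would."""
    with tempfile.TemporaryDirectory() as tmp:
        db_path = os.path.join(tmp, "demo.db")
        conn = sqlite3.connect(db_path)
        conn.execute("CREATE TABLE documents (id INTEGER, title TEXT)")
        conn.execute("INSERT INTO documents VALUES (1, 'test one')")
        conn.commit()
        conn.close()

        server = make_server(db_path)
        port = server.server_address[1]
        thread = threading.Thread(target=server.serve_forever, daemon=True)
        thread.start()
        try:
            client = ServerProxy("http://127.0.0.1:%d" % port)
            return client.run_query(
                "SELECT id, title FROM documents WHERE id = ?", [1])
        finally:
            server.shutdown()
            server.server_close()


if __name__ == "__main__":
    print(demo())
```

Only the server process touches the database file, so SQLite's file locking stays on one machine and the network file system problems described above do not arise.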

⑤ Python: showing the system time in a Tkinter text widget

The following code gets the current system time and displays it with Tkinter (using a Label widget):

from tkinter import *
import time


def tick():
    global time1
    # get the current system time from the computer running the program
    time2 = time.strftime('%H:%M:%S')
    # if the time has changed, update the displayed system time
    if time2 != time1:
        time1 = time2
        clock.config(text=time2)
    # schedule tick() to run again after 200 ms;
    # could use >200 ms, but display gets jerky
    clock.after(200, tick)


root = Tk()
time1 = ''
status = Label(root, text="v1.0", bd=1, relief=SUNKEN, anchor=W)
status.grid(row=0, column=0)
clock = Label(root, font=('times', 20, 'bold'), bg='green')
clock.grid(row=0, column=1)
tick()
root.mainloop()
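
`time.strftime` has no directive for fractions of a second, so a clock that should also show milliseconds needs another source. One possible approach, sketched here with a hypothetical helper `format_with_ms`, is to format a `datetime` with `%f` (microseconds) and cut off the last three digits.

```python
from datetime import datetime


def format_with_ms(moment):
    """Format a datetime as HH:MM:SS.mmm (milliseconds truncated, not rounded)."""
    return moment.strftime('%H:%M:%S.%f')[:-3]


if __name__ == "__main__":
    print(format_with_ms(datetime.now()))
```

In `tick()`, `time2 = format_with_ms(datetime.now())` could stand in for the `time.strftime` call, though the 200 ms refresh interval would then be too coarse for the last digits to mean much.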