pythonmilliseconds

Published: 2024-12-15 11:48:50

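① Installing and using Sphinx on Windows [with Chinese full-text search support]

A while ago I tried out Sphinx, a full-text search system that can be called easily from many languages (PHP, Python, Ruby, etc.). Most of the material online covers installing and using it on Linux. A production deployment certainly belongs on a *nix system, but for learning and testing, Windows is more convenient.

This article gives a quick way to install and configure Sphinx on Windows with Chinese full-text search support. The configuration part applies to Linux as well.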

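I. About Sphinx

Sphinx is a full-text search engine released under GPLv2. Commercial use (for example, embedding it in another program) requires contacting the author (Sphinxsearch.com) for a commercial license.

Generally speaking, Sphinx is a standalone search engine intended to give other applications fast, space-efficient, highly relevant full-text search. It integrates very easily with SQL databases and scripting languages.

The current version has built-in data source support for MySQL and PostgreSQL, and it can also read XML data in a specific format from standard input. By modifying the source code, users can add new data sources (for example, native support for other DBMSs).

The search API supports PHP, Python, Perl, Ruby and Java, and Sphinx can also be used as a MySQL storage engine. The search API is very simple and can be ported to a new language within a few hours.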

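Sphinx features:

High indexing speed (peak performance of up to 10 MB/s on modern CPUs);
High search performance (average response time under 0.1 s per query on 2–4 GB of text);
Handles very large data sets (known to handle more than 100 GB of text, and 100M documents on a single-CPU system);
A good relevance algorithm: a combined ranking method based on phrase proximity and statistics (BM25);
Distributed search;
Document excerpt generation;
Search available as a MySQL storage engine;
Multiple query modes, including boolean, phrase and word proximity;
Multiple full-text fields per document (at most 32);
Multiple additional attributes per document (for example, group information and timestamps);
Stopword support;
Single-byte encodings and UTF-8;
Native MySQL support (both MyISAM and InnoDB);
Native PostgreSQL support.

The Chinese manual is available online (backup download from Kuqin (酷勤网): sphinx_doc_zhcn_0.9.pdf).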

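II. Installing Sphinx on Windows

1. Download the latest Windows build from http://www.sphinxsearch.com/downloads.html. I used "Win32 release binaries with MySQL support" and unpacked it into D:\sphinx.

2. Under D:\sphinx create a data directory for the index files and a log directory for the log files, then copy D:\sphinx\sphinx.conf.in to D:\sphinx\bin\sphinx.conf (note the changed file name).

3. Edit D:\sphinx\bin\sphinx.conf. The settings that need changing are listed here: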

type = mysql # data source type; MySQL here
sql_host = localhost # database server
sql_user = root # database user name
sql_pass = '' # database password
sql_db = test # database name
sql_port = 3306 # database port

sql_query_pre = SET NAMES utf8 # uncomment this line if your database uses UTF-8 encoding

index test1
{
    # directory for the index files
    path = D:/sphinx/data/
    # encoding
    charset_type = utf-8
    # character table for UTF-8
    charset_table = 0..9, A..Z->a..z, _, a..z, U+410..U+42F->U+430..U+44F, U+430..U+44F
    # simple tokenization; only 0 and 1 are supported; set to 1 to search Chinese
    ngram_len = 1
    # characters to tokenize; uncomment this to search Chinese
    ngram_chars = U+3000..U+2FA1F
}

# index test1stemmed : test1
# {
    # path = @CONFDIR@/data/test1stemmed
    # morphology = stem_en
# }

# If you have no distributed index, comment out the following
# index dist1
# {
    # 'distributed' index type MUST be specified
    # type = distributed

    # local index to be searched
    # there can be many local indexes configured
    # local = test1
    # local = test1stemmed

    # remote agent
    # multiple remote agents may be specified
    # syntax is 'hostname:port:index1,[index2[,...]]
    # agent = localhost:3313:remote1
    # agent = localhost:3314:remote2,remote3

    # remote agent connection timeout, milliseconds
    # optional, default is 1000 ms, ie. 1 sec
    # agent_connect_timeout = 1000

    # remote agent query timeout, milliseconds
    # optional, default is 3000 ms, ie. 3 sec
    # agent_query_timeout = 3000
# }

# Parts of the search daemon settings that need changing
searchd
{
    # log file
    log = D:/sphinx/log/searchd.log

    # PID file, searchd process ID file name
    pid_file = D:/sphinx/log/searchd.pid

    # when starting searchd on Windows, this line must be commented out
    # seamless_rotate = 1
}
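To give a feel for what ngram_len = 1 and ngram_chars ask for, here is a rough Python sketch of unigram tokenization. It is only a conceptual illustration, not Sphinx's actual tokenizer: the function name unigram_tokens is made up, only the ASCII part of charset_table is handled, and the Cyrillic ranges are ignored.

```python
import re

# Code-point range copied from the ngram_chars setting above.
NGRAM_FIRST, NGRAM_LAST = 0x3000, 0x2FA1F

# Either a run of ASCII word characters, or any single non-space character.
TOKEN_RE = re.compile(r"(?P<word>[0-9A-Za-z_]+)|(?P<char>\S)")


def unigram_tokens(text):
    """Split text into lowercase ASCII words and single CJK characters."""
    tokens = []
    for match in TOKEN_RE.finditer(text):
        if match.group("word"):
            # Mirrors the A..Z->a..z case folding in charset_table.
            tokens.append(match.group("word").lower())
        elif NGRAM_FIRST <= ord(match.group("char")) <= NGRAM_LAST:
            # With ngram_len = 1, each character in range is its own token.
            tokens.append(match.group("char"))
        # Anything else (e.g. ASCII punctuation) is treated as a separator.
    return tokens


print(unigram_tokens("Test 中文 doc!"))
```

Because every Chinese character becomes a separate token, a query such as 中文 can match documents without a dictionary-based word segmenter, at the cost of a larger index.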

4. Import the test data

C:\Program Files\MySQL\MySQL Server 5.0\bin>mysql -uroot test < d:/sphinx/example.sql

5. Build the index

D:\sphinx\bin>indexer.exe test1
Sphinx 0.9.8-release (r1533)
Copyright (c) 2001-2008, Andrew Aksyonoff

using config file './sphinx.conf'...
indexing index 'test1'...
collected 4 docs, 0.0 MB
sorted 0.0 Mhits, 100.0% done
total 4 docs, 193 bytes
total 0.101 sec, 1916.30 bytes/sec, 39.72 docs/sec

D:\sphinx\bin>

6. Try searching for 'test'

D:\sphinx\bin>search.exe test
Sphinx 0.9.8-release (r1533)
Copyright (c) 2001-2008, Andrew Aksyonoff

using config file './sphinx.conf'...
index 'test1': query 'test ': returned 3 matches of 3 total in 0.000 sec

displaying matches:
1. document=1, weight=2, group_id=1, date_added=Wed Nov 26 14:58:59 2008
        id=1
        group_id=1
        group_id2=5
        date_added=2008-11-26 14:58:59
        title=test one
        content=this is my test document number one. also checking search within phrases.
2. document=2, weight=2, group_id=1, date_added=Wed Nov 26 14:58:59 2008
        id=2
        group_id=1
        group_id2=6
        date_added=2008-11-26 14:58:59
        title=test two
        content=this is my test document number two
3. document=4, weight=1, group_id=2, date_added=Wed Nov 26 14:58:59 2008
        id=4
        group_id=2
        group_id2=8
        date_added=2008-11-26 14:58:59
        title=doc number four
        content=this is to test groups

words:
1. 'test': 3 documents, 5 hits

D:\sphinx\bin>

All three matching documents are found.

7. Test Chinese search

Modify the documents table in the test database:

UPDATE `test`.`documents` SET `title` = '测试中文', `content` = 'this is my test document number two,应该搜的到吧' WHERE `documents`.`id` = 2;

Rebuild the index:

D:\sphinx\bin>indexer.exe --all

Try searching for '中文':

D:\sphinx\bin>search.exe 中文
Sphinx 0.9.8-release (r1533)
Copyright (c) 2001-2008, Andrew Aksyonoff

using config file './sphinx.conf'...
index 'test1': query '中文 ': returned 0 matches of 0 total in 0.000 sec

words:

D:\sphinx\bin>

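Nothing seems to be found. That is because the Windows command line uses GBK encoding, so the query naturally does not match the UTF-8 index. We can try from a program instead: create a file named foo.php under D:\sphinx\api, making sure it is saved as UTF-8.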

<?php
require 'sphinxapi.php';
$s = new SphinxClient();
$s->SetServer('localhost', 3312);
$result = $s->Query('中文');
var_dump($result);
?>

Start the Sphinx searchd service:

D:\sphinx\bin>searchd.exe
Sphinx 0.9.8-release (r1533)
Copyright (c) 2001-2008, Andrew Aksyonoff

WARNING: forcing --console mode on Windows
using config file './sphinx.conf'...
creating server socket on 0.0.0.0:3312
accepting connections

Run the PHP query:

php d:/sphinx/api/foo.php

Did the result come back? What remains is to read the manual and work through the more advanced configuration at your own pace.
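As an aside on the empty command-line result in step 7, the encoding mismatch is easy to see in Python. This is a small illustration under one assumption: that the console hands the query to search.exe as GBK bytes while the index was built from UTF-8 text.

```python
query = "中文"

utf8_bytes = query.encode("utf-8")  # bytes as stored in a UTF-8 database
gbk_bytes = query.encode("gbk")     # bytes as typed in a GBK console (assumed)

print(utf8_bytes.hex())  # e4b8ade69687
print(gbk_bytes.hex())   # d6d0cec4

# The GBK bytes are not even valid UTF-8, so they cannot match indexed text.
try:
    gbk_bytes.decode("utf-8")
    decodes_as_utf8 = True
except UnicodeDecodeError:
    decodes_as_utf8 = False

print(decodes_as_utf8)  # False
```

The same two characters produce entirely different byte sequences, which is why the query has to come from a client that sends UTF-8, such as the UTF-8 encoded foo.php.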

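② How to do time addition and subtraction in Python

Use timedelta; it supports the arithmetic directly.
datetime.timedelta(days=0, seconds=0, microseconds=0, milliseconds=0, minutes=0, hours=0, weeks=0)
timedelta accepts days, hours, minutes, seconds, weeks, milliseconds and so on. Adding it to or subtracting it from a datetime shifts the time by that amount.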
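A minimal sketch of this arithmetic follows; the start time is an arbitrary value chosen for illustration.

```python
from datetime import datetime, timedelta

start = datetime(2024, 12, 15, 11, 48, 50)

# Add 1 day, 2 hours and 500 milliseconds in a single step.
later = start + timedelta(days=1, hours=2, milliseconds=500)
print(later)  # 2024-12-16 13:48:50.500000

# Subtracting a timedelta moves backwards in time.
earlier = start - timedelta(weeks=1)
print(earlier)  # 2024-12-08 11:48:50

# Subtracting two datetimes yields a timedelta; dividing it by a
# one-millisecond timedelta gives the elapsed time in milliseconds.
elapsed_ms = (later - start) / timedelta(milliseconds=1)
print(elapsed_ms)  # 93600500.0
```

Dividing by timedelta(milliseconds=1) avoids hand-written unit conversion factors when a millisecond count is needed.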

③ Python code to join two MP3 audio files into one MP3 file

You can use pydub.

1 URL: https://github.com/jiaaro/pydub

2 pydub depends on libav or ffmpeg

3 Install the dependency on macOS (choose one of the two):

brew install libav --with-libvorbis --with-sdl --with-theora

Or install ffmpeg with all of its optional dependencies:

brew install ffmpeg --with-fdk-aac --with-ffplay --with-freetype --with-frei0r --with-libass --with-libvo-aacenc --with-libvorbis --with-libvpx --with-opencore-amr --with-openjpeg --with-opus --with-rtmpdump --with-schroedinger --with-speex --with-theora --with-tools --with-fdk-aac --with-freetype --with-ffplay --with-ffplay --with-freetype --with-frei0r --with-libass --with-libbluray --with-libcaca --with-libquvi --with-libvidstab --with-libvo-aacenc --with-libvorbis --with-libvpx --with-opencore-amr --with-openjpeg --with-openssl --with-opus --with-rtmpdump --with-schroedinger --with-speex --with-theora --with-tools --with-x265

4 Install pydub: pip install pydub

5 Using pydub:

Example code:

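from pydub import AudioSegment

# enDir, cnDir, toDir, file and enfile come from the surrounding script (not shown)
enPath = "%s%s/%s" % (enDir, file, enfile)  # path of the English file
cnPath = "%s%s/%s" % (cnDir, file, enfile.replace("en_w", "cn_w"))  # path of the Chinese file
targetPath = "%s%s/%s" % (toDir, file, enfile.replace("en_w", "all"))  # path of the merged file

# Load the MP3 files
song1 = AudioSegment.from_mp3(enPath)
song2 = AudioSegment.from_mp3(cnPath)

# Get the loudness (dBFS) of both MP3 files
db1 = song1.dBFS
db2 = song2.dBFS

song1 = song1[300:]  # keep the English MP3 from 300 ms onward

# Balance the volume so that one clip is not louder than the other
dbplus = db1 - db2
if dbplus < 0:  # song1 is quieter
    song1 += abs(dbplus)
elif dbplus > 0:  # song2 is quieter
    song2 += abs(dbplus)

# Concatenate the two audio files
song = song1 + song2

# Export the audio file
song.export(targetPath, format="mp3")  # export as MP3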

④ Does SQLite in Python support remote database access?

SQLite uses its own file locking to handle this problem.
Multiple processes can have the same database open at the same time. Multiple processes can be doing a SELECT at the same time. But only one process can be making changes to the database at any moment in time, however.
SQLite uses reader/writer locks to control access to the database. (Under Win95/98/ME which lacks support for reader/writer locks, a probabilistic simulation is used instead.) But use caution: this locking mechanism might not work correctly if the database file is kept on an NFS filesystem. This is because fcntl() file locking is broken on many NFS implementations. You should avoid putting SQLite database files on NFS if multiple processes might try to access the file at the same time. On Windows, Microsoft's documentation says that locking may not work under FAT filesystems if you are not running the Share.exe daemon. People who have a lot of experience with Windows tell me that file locking of network files is very buggy and is not dependable. If what they say is true, sharing an SQLite database between two or more Windows machines might cause unexpected problems.
We are aware of no other embedded SQL database engine that supports as much concurrency as SQLite. SQLite allows multiple processes to have the database file open at once, and for multiple processes to read the database at once. When any process wants to write, it must lock the entire database file for the duration of its update. But that normally only takes a few milliseconds. Other processes just wait on the writer to finish then continue about their business. Other embedded SQL database engines typically only allow a single process to connect to the database at once.
However, client/server database engines (such as PostgreSQL, MySQL, or Oracle) usually support a higher level of concurrency and allow multiple processes to be writing to the same database at the same time. This is possible in a client/server database because there is always a single well-controlled server process available to coordinate access. If your application has a need for a lot of concurrency, then you should consider using a client/server database. But experience suggests that most applications need much less concurrency than their designers imagine.
When SQLite tries to access a file that is locked by another process, the default behavior is to return SQLITE_BUSY. You can adjust this behavior from C code using the sqlite3_busy_handler() or sqlite3_busy_timeout() API functions.
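From Python, the busy timeout is exposed through the timeout argument of sqlite3.connect (in seconds). The sketch below shows a second connection hitting the write lock. The temporary database, the table name t and the 0.2 s timeout are arbitrary choices for the demonstration.

```python
import os
import sqlite3
import tempfile

db_path = os.path.join(tempfile.mkdtemp(), "demo.db")

# isolation_level=None puts the connection in autocommit mode so that
# BEGIN and COMMIT can be issued explicitly.
writer = sqlite3.connect(db_path, isolation_level=None)
writer.execute("CREATE TABLE t (x INTEGER)")
writer.execute("BEGIN IMMEDIATE")          # take the write lock and hold it
writer.execute("INSERT INTO t VALUES (1)")

# timeout is in seconds: retry for up to 200 ms, then give up.
other = sqlite3.connect(db_path, timeout=0.2, isolation_level=None)
try:
    other.execute("INSERT INTO t VALUES (2)")
    outcome = "written"
except sqlite3.OperationalError as exc:    # SQLITE_BUSY surfaces here
    outcome = str(exc)

writer.execute("COMMIT")                   # release the lock
other.execute("INSERT INTO t VALUES (2)")  # now the write goes through
rows = other.execute("SELECT COUNT(*) FROM t").fetchone()[0]
print(outcome, rows)

writer.close()
other.close()
```

With a longer timeout the second writer would simply wait for the first to commit, which matches the "other processes just wait on the writer" behavior described above.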
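SQLite is just a local file, with its API shipped in each language's development package; it has no client/server networking of its own.
See the official documentation:
"If you have many client programs accessing a common database over a network, you should consider using a client/server database engine instead of SQLite."
If you really need remote access, there are a few options:
1. Switch to a database that supports network access, such as MySQL.
If you insist on using SQLite:
2. As the previous answer said, use a network file system, but this is not recommended. Random reads and writes perform poorly on NFS and similar systems, and stability is a concern.
3. Wrap it with RPC, such as Thrift or XML-RPC; with Java, RMI is also readily available.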
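Option 3 can be sketched with the standard library's XML-RPC modules. This is only an illustration under stated assumptions: the notes table and the add_note and list_notes methods are hypothetical, an in-memory database stands in for a real file, and there is no authentication or error handling.

```python
import sqlite3
import threading
from xmlrpc.client import ServerProxy
from xmlrpc.server import SimpleXMLRPCServer

# One server process owns the database; remote clients never open the file.
# ":memory:" keeps the demo self-contained; a real service would use a path.
db = sqlite3.connect(":memory:", check_same_thread=False)
db.execute("CREATE TABLE notes (id INTEGER PRIMARY KEY, body TEXT)")
db_lock = threading.Lock()


def add_note(body):
    """Insert one row and return its id (hypothetical RPC method)."""
    with db_lock, db:  # "with db" commits on success
        cur = db.execute("INSERT INTO notes (body) VALUES (?)", (body,))
        return cur.lastrowid


def list_notes():
    """Return all rows as [id, body] lists (hypothetical RPC method)."""
    with db_lock:
        cur = db.execute("SELECT id, body FROM notes ORDER BY id")
        return [list(row) for row in cur]


# Port 0 asks the OS for any free port; a real service would fix the port.
server = SimpleXMLRPCServer(("127.0.0.1", 0), logRequests=False)
server.register_function(add_note)
server.register_function(list_notes)
threading.Thread(target=server.serve_forever, daemon=True).start()

# The client side only needs the URL, not the database file.
host, port = server.server_address
client = ServerProxy(f"http://{host}:{port}/")
new_id = client.add_note("hello over the network")
notes = client.list_notes()
print(new_id, notes)

server.shutdown()
server.server_close()
```

All database access goes through a single process, so SQLite's file locking never has to work across machines. The trade-off is that the wrapper becomes a small database server that has to be secured and maintained.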

⑤ Python programming: showing the system time in a Tkinter text box

The following code fetches the current system time and displays it in a Tkinter Label widget:

from tkinter import *
import time

def tick():
    global time1
    # Get the current system time from the computer running the program
    time2 = time.strftime('%H:%M:%S')
    # If the time has changed, update the displayed system time
    if time2 != time1:
        time1 = time2
        clock.config(text=time2)
    # Schedule tick() to run again after 200 ms;
    # could use >200 ms, but display gets jerky
    clock.after(200, tick)

root = Tk()
time1 = ''
status = Label(root, text="v1.0", bd=1, relief=SUNKEN, anchor=W)
status.grid(row=0, column=0)
clock = Label(root, font=('times', 20, 'bold'), bg='green')
clock.grid(row=0, column=1)
tick()
root.mainloop()
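If the label should also show milliseconds, one possible approach is to format the time with datetime, whose strftime supports the %f (microseconds) directive. The helper name clock_text below is made up for this sketch.

```python
from datetime import datetime


def clock_text(now):
    """Format a datetime as HH:MM:SS.mmm (millisecond precision)."""
    # %f yields six digits of microseconds; dropping the last three leaves ms.
    return now.strftime("%H:%M:%S.%f")[:-3]


print(clock_text(datetime(2024, 12, 15, 11, 48, 50, 123456)))  # 11:48:50.123
print(clock_text(datetime.now()))
```

In tick() this would replace the time.strftime call, for example time2 = clock_text(datetime.now()). The 200 ms interval passed to clock.after would then likely need to be much shorter for the millisecond digits to update smoothly.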