
scapyallpython

Published: 2023-03-20 09:41:55

❶ Why does import scapy raise an error in Python?

Cause of the problem: when the script runs import scapy, Python (which automatically appends the .py suffix while searching, and checks the current directory first) resolved the import to man.py in the current directory rather than the installed library. Because the two files are not the same thing (one is the third-party library we meant to import, the other is a file we created ourselves), the import raises an error.
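A quick way to confirm this kind of name shadowing is to check where Python actually loaded the module from (a minimal sketch; the fix is simply to rename or delete the conflicting local file and any stale .pyc next to it):

#!/usr/bin/env python
# If the printed path points into the current working directory rather
# than site-packages, a local file is shadowing the installed library.
import scapy
print(scapy.__file__)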

1. If an error occurs and is not caught (never mind for now what "caught" means; just think of it as an error happening), it keeps being propagated upward until it is finally caught by the Python interpreter. The interpreter then prints a long block of error information in the place where the result should have appeared, and the program exits. Example code follows:
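The original answer's example is not reproduced here; a minimal illustration of an uncaught exception bubbling up to the interpreter (err.py is just a hypothetical file name) could look like this:

# err.py - nothing catches the ZeroDivisionError, so it propagates up
# to the interpreter, which prints a traceback and exits.
def div(a, b):
    return a / b

div(10, 0)

# Traceback (most recent call last):
#   ...
# ZeroDivisionError: division by zero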

❷ How to run the scapy shell in Python

Starting the shell
You can start the shell with the following command:

scrapy shell <url>

where <url> is the URL of the page you want to scrape.

Using the shell
The Scrapy shell can be thought of as a Python console program with a few useful helper functions built in.

Helper functions
shelp() - print the list of available objects and functions
fetch(request_or_url) - fetch a new response from the given URL or an existing request object, and update the related objects accordingly
view(response) - open the given response object (in other words, the HTML page) in a browser
Scrapy objects
When the Scrapy shell downloads the specified page, it creates a number of usable objects, such as a Response object and a Selector object (for both HTML and XML).
The available objects are:

crawler - the current Crawler object
spider - the Spider handling the current URL
request - the Request object of the last fetched page
response - the Response object containing the last fetched page
sel - the Selector object for the most recently downloaded page
settings - the current Scrapy settings
Scrapy shell example
Using my personal blog as the test target: http://blog.csdn.net/php_fly
First, we start the shell:

scrapy shell http://blog.csdn.net/php_fly --nolog

After this command runs, the Scrapy downloader fetches the page at the given URL and prints the list of available objects and functions:

[s] Available Scrapy objects:
[s] crawler <scrapy.crawler.Crawler object at 0x0000000002AEF7B8>
[s] item {}
[s] request <GET http://blog.csdn.net/php_fly>
[s] response <200 http://blog.csdn.net/php_fly>
[s] sel <Selector xpath=None data=u'<html xmlns="http://www.w3.org/1999/xhtm'>
[s] settings <CrawlerSettings module=None>
[s] spider <Spider 'default' at 0x4cdb940>
[s] Useful shortcuts:
[s] shelp() Shell help (print this help)
[s] fetch(req_or_url) Fetch request (or URL) and update local objects
[s] view(response) View response in a browser
Get the article-list hyperlinks from the 曾是土木人 blog:
In [9]: sel.xpath("//span[@class='link_title']/a/@href").extract()
Out[9]:
[u'/php_fly/article/details/19364913',
u'/php_fly/article/details/18155421',
u'/php_fly/article/details/17629021',
u'/php_fly/article/details/17619689',
u'/php_fly/article/details/17386163',
u'/php_fly/article/details/17266889',
u'/php_fly/article/details/17172381',
u'/php_fly/article/details/17171985',
u'/php_fly/article/details/17145295',
u'/php_fly/article/details/17122961',
u'/php_fly/article/details/17117891',
u'/php_fly/article/details/14533681',
u'/php_fly/article/details/13162011',
u'/php_fly/article/details/12658277',
u'/php_fly/article/details/12528391',
u'/php_fly/article/details/12421473',
u'/php_fly/article/details/12319943',
u'/php_fly/article/details/12293587',
u'/php_fly/article/details/12293381',
u'/php_fly/article/details/12289803']
Changing the request method used by the scrapy shell:

>>> request = request.replace(method="POST")
>>> fetch(request)
[s] Available Scrapy objects:
[s] crawler <scrapy.crawler.Crawler object at 0x1e16b50>
...

Calling the Scrapy shell from a Spider
While a crawl is running, you sometimes need to check whether a particular response is what you expect.
This can be done with the scrapy.shell.inspect_response function.
Below is an example of how to invoke the scrapy shell from inside a spider:

from scrapy.spider import Spider

class MySpider(Spider):
    name = "myspider"
    start_urls = [
        "http://example.com",
        "http://example.org",
        "http://example.net",
    ]

    def parse(self, response):
        # We want to inspect one specific response.
        if ".org" in response.url:
            from scrapy.shell import inspect_response
            inspect_response(response)

        # Rest of parsing code.
When you start the crawler, the console will print something like the following:
2014-02-20 17:48:31-0400 [myspider] DEBUG: Crawled (200) <GET http://example.com> (referer: None)
2014-02-20 17:48:31-0400 [myspider] DEBUG: Crawled (200) <GET http://example.org> (referer: None)
[s] Available Scrapy objects:
[s] crawler <scrapy.crawler.Crawler object at 0x1e16b50>
...
>>> response.url
'http://example.org'
Note: while the Scrapy engine is occupied by the scrapy shell, the fetch function cannot be used inside the shell. When you exit the shell, however, the spider continues crawling from where it stopped.

❸ A question about performance optimization of scapy and socket in Python

I can't think of a better solution. Hoping an expert can answer this.

❹ A packet-capture library for Python 3.2: neither pypcap nor scapy seems to have a Python 3 version. Desperately looking for one that works with Python 3

There is py3kcap, a wrapper around pycap that can be used with Python 3.

Here is some example code to get you started:

#!/usr/bin/env python3.2
import ctypes, sys
from ctypes.util import find_library
#pcap = ctypes.cdll.LoadLibrary("libpcap.so")
pcap = None
if find_library("libpcap") == None:
    print("We are here!")
    pcap = ctypes.cdll.LoadLibrary("libpcap.so")
else:
    pcap = ctypes.cdll.LoadLibrary(find_library("libpcap"))
# required so we can access bpf_program->bf_insns
"""
struct bpf_program {
    u_int bf_len;
    struct bpf_insn *bf_insns;
}
"""
class bpf_program(ctypes.Structure):
    _fields_ = [("bf_len", ctypes.c_int), ("bf_insns", ctypes.c_void_p)]
class sockaddr(ctypes.Structure):
    _fields_ = [("sa_family", ctypes.c_uint16), ("sa_data", ctypes.c_char * 14)]
class pcap_pkthdr(ctypes.Structure):
    _fields_ = [("tv_sec", ctypes.c_long), ("tv_usec", ctypes.c_long), ("caplen", ctypes.c_uint), ("len", ctypes.c_uint)]

pkthdr = pcap_pkthdr()
program = bpf_program()
# prepare args
snaplen = ctypes.c_int(1500)
#buf = ctypes.c_char_p(filter)
optimize = ctypes.c_int(1)
mask = ctypes.c_uint()
net = ctypes.c_uint()
to_ms = ctypes.c_int(100000)
promisc = ctypes.c_int(1)
filter = bytes(str("port 80"), 'ascii')
buf = ctypes.c_char_p(filter)
errbuf = ctypes.create_string_buffer(256)
pcap_close = pcap.pcap_close
pcap_lookupdev = pcap.pcap_lookupdev
pcap_lookupdev.restype = ctypes.c_char_p
#pcap_lookupnet(dev, &net, &mask, errbuf)
pcap_lookupnet = pcap.pcap_lookupnet
#pcap_t *pcap_open_live(const char *device, int snaplen, int promisc, int to_ms,
#                       char *errbuf)
pcap_open_live = pcap.pcap_open_live
#int pcap_compile(pcap_t *p, struct bpf_program *fp, const char *str, int optimize,
#                 bpf_u_int32 netmask)
pcap_compile = pcap.pcap_compile
#int pcap_setfilter(pcap_t *p, struct bpf_program *fp);
pcap_setfilter = pcap.pcap_setfilter
#const u_char *pcap_next(pcap_t *p, struct pcap_pkthdr *h);
pcap_next = pcap.pcap_next
#int pcap_compile_nopcap(int snaplen, int linktype, struct bpf_program *program,
#                        const char *buf, int optimize, bpf_u_int32 mask);
pcap_geterr = pcap.pcap_geterr
pcap_geterr.restype = ctypes.c_char_p
# check for default lookup device
dev = pcap_lookupdev(errbuf)
# override it for now ..
dev = bytes(str("wlan0"), 'ascii')
if dev:
    print("{0} is the default interface".format(dev))
else:
    print("Was not able to find default interface")

if pcap_lookupnet(dev, ctypes.byref(net), ctypes.byref(mask), errbuf) == -1:
    print("Error could not get netmask for device {0}".format(errbuf.value))
    sys.exit(0)
else:
    print("Got Required netmask")
handle = pcap_open_live(dev, snaplen, promisc, to_ms, errbuf)
if not handle:
    print("Error unable to open session : {0}".format(errbuf.value))
    sys.exit(0)
else:
    print("Pcap open live worked!")
if pcap_compile(handle, ctypes.byref(program), buf, optimize, mask) == -1:
    # this requires we call pcap_geterr() to get the error
    err = pcap_geterr(handle)
    print("Error could not compile bpf filter because {0}".format(err))
else:
    print("Filter Compiled!")
if pcap_setfilter(handle, ctypes.byref(program)) == -1:
    print("Error couldn't install filter {0}".format(errbuf.value))
    sys.exit(0)
else:
    print("Filter installed!")
if pcap_next(handle, ctypes.byref(pkthdr)) == -1:
    err = pcap_geterr(handle)
    print("ERROR pcap_next: {0}".format(err))
print("Got {0} bytes of data".format(pkthdr.len))
pcap_close(handle)
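Note that this example opens the interface in promiscuous mode, so it normally has to be run with root privileges; when pcap_open_live or pcap_lookupnet fails, libpcap writes the reason into errbuf, which is why the error branches above print errbuf.value.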

❺ Analyzing the packet loss rate of a pcap file with a Python program

With scapy and scapy_http it is easy to parse the HTTP packets contained in a pcap file.

#!/usr/bin/env python
try:
    import scapy.all as scapy
except ImportError:
    import scapy

try:
    # This import works from the project directory
    import scapy_http.http
except ImportError:
    # If you installed this package via pip, you just need to execute this
    from scapy.layers import http

packets = scapy.rdpcap('f:\\abc123.pcap')
for p in packets:
    print '=' * 78
    #print p.show()
    for f in p.payload.fields_desc:
        if f.name == 'src' or f.name == 'dst':
            ct = scapy.conf.color_theme
            vcol = ct.field_value
            fvalue = p.payload.getfieldval(f.name)
            reprval = f.i2repr(p.payload, fvalue)
            print "%s : %s" % (f.name, reprval)

    for f in p.payload.payload.fields_desc:
        if f.name == 'load':
            ct = scapy.conf.color_theme
            vcol = ct.field_value
            fvalue = p.payload.getfieldval(f.name)
            reprval = f.i2repr(p.payload, fvalue)
            print "%s : %s" % (f.name, reprval)

Here p is one packet; as the dump below shows, it is split into three layers: Ethernet -> IP -> Raw.

Calling p.show() prints output like the following:
###[ Ethernet ]###
dst = 02:00:00:00:00:39
src = 00:00:00:01:02:09
type = 0x800
###[ IP ]###
version = 4L
ihl = 5L
tos = 0x0
len = 1014
id = 7180
flags =
frag = 0L
ttl = 45
proto = tcp
chksum = 0xbbf9
src = 126.209.59.13
dst = 121.113.176.25
\options \
###[ Raw ]###
load = '.....'

The first layer is the Ethernet (link) layer, carrying the source and destination MAC addresses and the EtherType; the second is the IP layer, carrying the addresses and the protocol number; the third (Raw) carries the transport payload, i.e. the port numbers and the HTTP message.
Each layer is exposed as the payload member of the layer above it.
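As a rough sketch of that payload chaining (reusing the packets list read with scapy.rdpcap above), you can walk the layer chain of a packet like this:

# Each .payload is the next layer down; NoPayload marks the end of the chain.
pkt = packets[0]
layer = pkt
while not isinstance(layer, scapy.NoPayload):
    print layer.name          # e.g. Ethernet, IP, Raw
    layer = layer.payload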
